1356 Commits

Author SHA1 Message Date
Alexander Mikhalitsyn
23b763302c tools headers: Sync uapi/asm-generic/socket.h with the kernel sources
This also fixes a wrong definitions for SCM_TS_OPT_ID & SO_RCVPRIORITY.

Accidentally found while working on another patchset.

Cc: linux-kernel@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Jason Xing <kerneljasonxing@gmail.com>
Cc: Anna Emese Nyiri <annaemesenyiri@gmail.com>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Fixes: a89568e9be ("selftests: txtimestamp: add SCM_TS_OPT_ID test")
Fixes: e45469e594 ("sock: Introduce SO_RCVPRIORITY socket option")
Link: https://lore.kernel.org/netdev/20250314195257.34854-1-kuniyu@amazon.com/
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Link: https://patch.msgid.link/20250314214155.16046-1-aleksandr.mikhalitsyn@canonical.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20 15:14:46 +01:00
Linus Torvalds
d0d106a2bd Merge tag 'bpf-next-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Pull bpf updates from Alexei Starovoitov:
 "A smaller than usual release cycle.

  The main changes are:

   - Prepare selftest to run with GCC-BPF backend (Ihor Solodrai)

     In addition to LLVM-BPF runs the BPF CI now runs GCC-BPF in compile
     only mode. Half of the tests are failing, since support for
     btf_decl_tag is still WIP, but this is a great milestone.

   - Convert various samples/bpf to selftests/bpf/test_progs format
     (Alexis Lothoré and Bastien Curutchet)

   - Teach verifier to recognize that array lookup with constant
     in-range index will always succeed (Daniel Xu)

   - Cleanup migrate disable scope in BPF maps (Hou Tao)

   - Fix bpf_timer destroy path in PREEMPT_RT (Hou Tao)

   - Always use bpf_mem_alloc in bpf_local_storage in PREEMPT_RT (Martin
     KaFai Lau)

   - Refactor verifier lock support (Kumar Kartikeya Dwivedi)

     This is a prerequisite for upcoming resilient spin lock.

   - Remove excessive 'may_goto +0' instructions in the verifier that
     LLVM leaves when unrolls the loops (Yonghong Song)

   - Remove unhelpful bpf_probe_write_user() warning message (Marco
     Elver)

   - Add fd_array_cnt attribute for prog_load command (Anton Protopopov)

     This is a prerequisite for upcoming support for static_branch"

* tag 'bpf-next-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (125 commits)
  selftests/bpf: Add some tests related to 'may_goto 0' insns
  bpf: Remove 'may_goto 0' instruction in opt_remove_nops()
  bpf: Allow 'may_goto 0' instruction in verifier
  selftests/bpf: Add test case for the freeing of bpf_timer
  bpf: Cancel the running bpf_timer through kworker for PREEMPT_RT
  bpf: Free element after unlock in __htab_map_lookup_and_delete_elem()
  bpf: Bail out early in __htab_map_lookup_and_delete_elem()
  bpf: Free special fields after unlock in htab_lru_map_delete_node()
  tools: Sync if_xdp.h uapi tooling header
  libbpf: Work around kernel inconsistently stripping '.llvm.' suffix
  bpf: selftests: verifier: Add nullness elision tests
  bpf: verifier: Support eliding map lookup nullness
  bpf: verifier: Refactor helper access type tracking
  bpf: tcp: Mark bpf_load_hdr_opt() arg2 as read-write
  bpf: verifier: Add missing newline on verbose() call
  selftests/bpf: Add distilled BTF test about marking BTF_IS_EMBEDDED
  libbpf: Fix incorrect traversal end type ID when marking BTF_IS_EMBEDDED
  libbpf: Fix return zero when elf_begin failed
  selftests/bpf: Fix btf leak on new btf alloc failure in btf_distill test
  veristat: Load struct_ops programs only once
  ...
2025-01-23 08:04:07 -08:00
Vishal Chourasia
01f3ce5328 tools: Sync if_xdp.h uapi tooling header
Sync if_xdp.h uapi header to remove following warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/if_xdp.h'
  differs from latest version at 'include/uapi/linux/if_xdp.h'

Fixes: 48eb03dd26 ("xsk: Add TX timestamp and TX checksum offload support")
Signed-off-by: Vishal Chourasia <vishalc@linux.ibm.com>
Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20250115032248.125742-1-yoong.siang.song@intel.com
2025-01-17 15:49:16 +01:00
Jakub Kicinski
a8a6531164 Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2025-01-07

We've added 7 non-merge commits during the last 32 day(s) which contain
a total of 11 files changed, 190 insertions(+), 103 deletions(-).

The main changes are:

1) Migrate the test_xdp_meta.sh BPF selftest into test_progs
   framework, from Bastien Curutchet.

2) Add ability to configure head/tailroom for netkit devices,
   from Daniel Borkmann.

3) Fixes and improvements to the xdp_hw_metadata selftest,
   from Song Yoong Siang.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
  selftests/bpf: Extend netkit tests to validate set {head,tail}room
  netkit: Add add netkit {head,tail}room to rt_link.yaml
  netkit: Allow for configuring needed_{head,tail}room
  selftests/bpf: Migrate test_xdp_meta.sh into xdp_context_test_run.c
  selftests/bpf: test_xdp_meta: Rename BPF sections
  selftests/bpf: Enable Tx hwtstamp in xdp_hw_metadata
  selftests/bpf: Actuate tx_metadata_len in xdp_hw_metadata
====================

Link: https://patch.msgid.link/20250107130908.143644-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07 15:39:09 -08:00
Daniel Borkmann
b9ed315d3c netkit: Allow for configuring needed_{head,tail}room
Allow the user to configure needed_{head,tail}room for both netkit
devices. The idea is similar to 163e529200 ("veth: implement
ndo_set_rx_headroom") with the difference that the two parameters
can be specified upon device creation. By default the current behavior
stays as is which is needed_{head,tail}room is 0.

In case of Cilium, for example, the netkit devices are not enslaved
into a bridge or openvswitch device (rather, BPF-based redirection
is used out of tcx), and as such these parameters are not propagated
into the Pod's netns via peer device.

Given Cilium can run in vxlan/geneve tunneling mode (needed_headroom)
and/or be used in combination with WireGuard (needed_{head,tail}room),
allow the Cilium CNI plugin to specify these two upon netkit device
creation.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/bpf/20241220234658.490686-1-daniel@iogearbox.net
2025-01-06 09:48:49 +01:00
Jakub Kicinski
385f186aba Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.13-rc6).

No conflicts.

Adjacent changes:

include/linux/if_vlan.h
  f91a5b8089 ("af_packet: fix vlan_get_protocol_dgram() vs MSG_PEEK")
  3f330db306 ("net: reformat kdoc return statements")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-03 16:29:29 -08:00
Linus Torvalds
f0bc704f46 Merge tag 'hardening-v6.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening fix from Kees Cook:

 - stddef: make __struct_group() UAPI C++-friendly (Alexander Lobakin)

* tag 'hardening-v6.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  stddef: make __struct_group() UAPI C++-friendly
2024-12-27 10:39:05 -08:00
Alexander Lobakin
724c6ce38b stddef: make __struct_group() UAPI C++-friendly
For the most part of the C++ history, it couldn't have type
declarations inside anonymous unions for different reasons. At the
same time, __struct_group() relies on the latters, so when the @TAG
argument is not empty, C++ code doesn't want to build (even under
`extern "C"`):

../linux/include/uapi/linux/pkt_cls.h:25:24: error:
'struct tc_u32_sel::<unnamed union>::tc_u32_sel_hdr,' invalid;
an anonymous union may only have public non-static data members
[-fpermissive]

The safest way to fix this without trying to switch standards (which
is impossible in UAPI anyway) etc., is to disable tag declaration
for that language. This won't break anything since for now it's not
buildable at all.
Use a separate definition for __struct_group() when __cplusplus is
defined to mitigate the error, including the version from tools/.

Fixes: 50d7bd38c3 ("stddef: Introduce struct_group() helper macro")
Reported-by: Christopher Ferris <cferris@google.com>
Closes: https://lore.kernel.org/linux-hardening/Z1HZpe3WE5As8UAz@google.com
Suggested-by: Kees Cook <kees@kernel.org> # __struct_group_tag()
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20241219135734.2130002-1-aleksander.lobakin@intel.com
Signed-off-by: Kees Cook <kees@kernel.org>
2024-12-20 09:05:53 -08:00
Jakub Kicinski
07e5c4eb94 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.13-rc4).

No conflicts.

Adjacent changes:

drivers/net/ethernet/renesas/rswitch.h
  32fd46f5b6 ("net: renesas: rswitch: remove speed from gwca structure")
  922b4b955a ("net: renesas: rswitch: rework ts tags management")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-19 11:35:07 -08:00
Anna Emese Nyiri
e45469e594 sock: Introduce SO_RCVPRIORITY socket option
Add new socket option, SO_RCVPRIORITY, to include SO_PRIORITY in the
ancillary data returned by recvmsg().
This is analogous to the existing support for SO_RCVMARK,
as implemented in commit 6fd1d51cfa ("net: SO_RCVMARK socket option
for SO_MARK with recvmsg()").

Reviewed-by: Willem de Bruijn <willemb@google.com>
Suggested-by: Ferenc Fejes <fejes@inf.elte.hu>
Signed-off-by: Anna Emese Nyiri <annaemesenyiri@gmail.com>
Link: https://patch.msgid.link/20241213084457.45120-5-annaemesenyiri@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-16 18:16:44 -08:00
Alexei Starovoitov
06103dccbb Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Cross-merge bpf fixes after downstream PR.

No conflicts.

Adjacent changes in:
Auto-merging include/linux/bpf.h
Auto-merging include/linux/bpf_verifier.h
Auto-merging kernel/bpf/btf.c
Auto-merging kernel/bpf/verifier.c
Auto-merging kernel/trace/bpf_trace.c
Auto-merging tools/testing/selftests/bpf/progs/test_tp_btf_nullable.c

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-12-16 08:53:59 -08:00
Anton Protopopov
4d3ae294f9 bpf: Add fd_array_cnt attribute for prog_load
The fd_array attribute of the BPF_PROG_LOAD syscall may contain a set
of file descriptors: maps or btfs. This field was introduced as a
sparse array. Introduce a new attribute, fd_array_cnt, which, if
present, indicates that the fd_array is a continuous array of the
corresponding length.

If fd_array_cnt is non-zero, then every map in the fd_array will be
bound to the program, as if it was used by the program. This
functionality is similar to the BPF_PROG_BIND_MAP syscall, but such
maps can be used by the verifier during the program load.

Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20241213130934.1087929-5-aspsk@isovalent.com
2024-12-13 14:48:36 -08:00
Namhyung Kim
3cef7d8b12 tools headers: Sync uapi/asm-generic/mman.h with the kernel sources
To pick up the changes in this cset:

  3630e82ab6 ("mman: Add map_shadow_stack() flags")

This addresses these perf build warnings:

  Warning: Kernel ABI header differences:
    diff -u tools/include/uapi/asm-generic/mman.h include/uapi/asm-generic/mman.h

Please see tools/include/uapi/README for further details.

Reviewed-by: James Clark <james.clark@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Mark Brown <broonie@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: linux-arch@vger.kernel.org
Link: https://lore.kernel.org/r/20241203035349.1901262-8-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-04 14:34:50 -08:00
Namhyung Kim
81b483f722 tools headers: Sync *xattrat syscall changes with the kernel sources
To pick up the changes in this cset:

  6140be90ec ("fs/xattr: add *at family syscalls")

This addresses these perf build warnings:

  Warning: Kernel ABI header differences:
    diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
    diff -u tools/perf/arch/x86/entry/syscalls/syscall_32.tbl arch/x86/entry/syscalls/syscall_32.tbl
    diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
    diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl
    diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl
    diff -u tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl

The arm64 changes are not included as it requires more changes in the
tools.  It'll be worked for the later cycle.

Please see tools/include/uapi/README for further details.

Reviewed-by: James Clark <james.clark@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
CC: x86@kernel.org
CC: linux-mips@vger.kernel.org
CC: linuxppc-dev@lists.ozlabs.org
CC: linux-s390@vger.kernel.org
Link: https://lore.kernel.org/r/20241203035349.1901262-7-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-04 14:34:50 -08:00
Namhyung Kim
e2064b7c5d tools headers: Sync uapi/linux/kvm.h with the kernel sources
To pick up the changes in this cset:

  e785dfacf7 ("LoongArch: KVM: Add PCHPIC device support")
  2e8b9df826 ("LoongArch: KVM: Add EIOINTC device support")
  c532de5a67 ("LoongArch: KVM: Add IPI device support")

This addresses these perf build warnings:

  Warning: Kernel ABI header differences:
    diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h

Please see tools/include/uapi/README for further details.

Reviewed-by: James Clark <james.clark@linaro.org>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Tianrui Zhao <zhaotianrui@loongson.cn>
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: kvm@vger.kernel.org
Cc: loongarch@lists.linux.dev
Link: https://lore.kernel.org/r/20241203035349.1901262-4-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-04 14:34:49 -08:00
Namhyung Kim
5229df8fb6 tools headers: Sync uapi/linux/perf_event.h with the kernel sources
To pick up the changes in this cset:

  18d92bb57c ("perf/core: Add aux_pause, aux_resume, aux_start_paused")

This addresses these perf build warnings:

  Warning: Kernel ABI header differences:
    diff -u tools/include/uapi/linux/perf_event.h include/uapi/linux/perf_event.h

Please see tools/include/uapi/README for further details.

Reviewed-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20241203035349.1901262-3-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-04 14:34:49 -08:00
Namhyung Kim
5fc3a088ee tools headers: Sync uapi/drm/drm.h with the kernel sources
To pick up the changes in this cset:

  56c594d8df ("drm: add DRM_SET_CLIENT_NAME ioctl")

This addresses these perf build warnings:

  Warning: Kernel ABI header differences:
    diff -u tools/include/uapi/drm/drm.h include/uapi/drm/drm.h

Please see tools/include/uapi/README for further details.

Reviewed-by: James Clark <james.clark@linaro.org>
Cc: David Airlie <airlied@gmail.com>
Cc: Simona Vetter <simona@ffwll.ch>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: dri-devel@lists.freedesktop.org
Link: https://lore.kernel.org/r/20241203035349.1901262-2-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-04 14:34:49 -08:00
Linus Torvalds
5c00ff742b Merge tag 'mm-stable-2024-11-18-19-27' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:

 - The series "zram: optimal post-processing target selection" from
   Sergey Senozhatsky improves zram's post-processing selection
   algorithm. This leads to improved memory savings.

 - Wei Yang has gone to town on the mapletree code, contributing several
   series which clean up the implementation:
	- "refine mas_mab_cp()"
	- "Reduce the space to be cleared for maple_big_node"
	- "maple_tree: simplify mas_push_node()"
	- "Following cleanup after introduce mas_wr_store_type()"
	- "refine storing null"

 - The series "selftests/mm: hugetlb_fault_after_madv improvements" from
   David Hildenbrand fixes this selftest for s390.

 - The series "introduce pte_offset_map_{ro|rw}_nolock()" from Qi Zheng
   implements some rationaizations and cleanups in the page mapping
   code.

 - The series "mm: optimize shadow entries removal" from Shakeel Butt
   optimizes the file truncation code by speeding up the handling of
   shadow entries.

 - The series "Remove PageKsm()" from Matthew Wilcox completes the
   migration of this flag over to being a folio-based flag.

 - The series "Unify hugetlb into arch_get_unmapped_area functions" from
   Oscar Salvador implements a bunch of consolidations and cleanups in
   the hugetlb code.

 - The series "Do not shatter hugezeropage on wp-fault" from Dev Jain
   takes away the wp-fault time practice of turning a huge zero page
   into small pages. Instead we replace the whole thing with a THP. More
   consistent cleaner and potentiall saves a large number of pagefaults.

 - The series "percpu: Add a test case and fix for clang" from Andy
   Shevchenko enhances and fixes the kernel's built in percpu test code.

 - The series "mm/mremap: Remove extra vma tree walk" from Liam Howlett
   optimizes mremap() by avoiding doing things which we didn't need to
   do.

 - The series "Improve the tmpfs large folio read performance" from
   Baolin Wang teaches tmpfs to copy data into userspace at the folio
   size rather than as individual pages. A 20% speedup was observed.

 - The series "mm/damon/vaddr: Fix issue in
   damon_va_evenly_split_region()" fro Zheng Yejian fixes DAMON
   splitting.

 - The series "memcg-v1: fully deprecate charge moving" from Shakeel
   Butt removes the long-deprecated memcgv2 charge moving feature.

 - The series "fix error handling in mmap_region() and refactor" from
   Lorenzo Stoakes cleanup up some of the mmap() error handling and
   addresses some potential performance issues.

 - The series "x86/module: use large ROX pages for text allocations"
   from Mike Rapoport teaches x86 to use large pages for
   read-only-execute module text.

 - The series "page allocation tag compression" from Suren Baghdasaryan
   is followon maintenance work for the new page allocation profiling
   feature.

 - The series "page->index removals in mm" from Matthew Wilcox remove
   most references to page->index in mm/. A slow march towards shrinking
   struct page.

 - The series "damon/{self,kunit}tests: minor fixups for DAMON debugfs
   interface tests" from Andrew Paniakin performs maintenance work for
   DAMON's self testing code.

 - The series "mm: zswap swap-out of large folios" from Kanchana Sridhar
   improves zswap's batching of compression and decompression. It is a
   step along the way towards using Intel IAA hardware acceleration for
   this zswap operation.

 - The series "kasan: migrate the last module test to kunit" from
   Sabyrzhan Tasbolatov completes the migration of the KASAN built-in
   tests over to the KUnit framework.

 - The series "implement lightweight guard pages" from Lorenzo Stoakes
   permits userapace to place fault-generating guard pages within a
   single VMA, rather than requiring that multiple VMAs be created for
   this. Improved efficiencies for userspace memory allocators are
   expected.

 - The series "memcg: tracepoint for flushing stats" from JP Kobryn uses
   tracepoints to provide increased visibility into memcg stats flushing
   activity.

 - The series "zram: IDLE flag handling fixes" from Sergey Senozhatsky
   fixes a zram buglet which potentially affected performance.

 - The series "mm: add more kernel parameters to control mTHP" from
   Maíra Canal enhances our ability to control/configuremultisize THP
   from the kernel boot command line.

 - The series "kasan: few improvements on kunit tests" from Sabyrzhan
   Tasbolatov has a couple of fixups for the KASAN KUnit tests.

 - The series "mm/list_lru: Split list_lru lock into per-cgroup scope"
   from Kairui Song optimizes list_lru memory utilization when lockdep
   is enabled.

* tag 'mm-stable-2024-11-18-19-27' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (215 commits)
  cma: enforce non-zero pageblock_order during cma_init_reserved_mem()
  mm/kfence: add a new kunit test test_use_after_free_read_nofault()
  zram: fix NULL pointer in comp_algorithm_show()
  memcg/hugetlb: add hugeTLB counters to memcg
  vmstat: call fold_vm_zone_numa_events() before show per zone NUMA event
  mm: mmap_lock: check trace_mmap_lock_$type_enabled() instead of regcount
  zram: ZRAM_DEF_COMP should depend on ZRAM
  MAINTAINERS/MEMORY MANAGEMENT: add document files for mm
  Docs/mm/damon: recommend academic papers to read and/or cite
  mm: define general function pXd_init()
  kmemleak: iommu/iova: fix transient kmemleak false positive
  mm/list_lru: simplify the list_lru walk callback function
  mm/list_lru: split the lock to per-cgroup scope
  mm/list_lru: simplify reparenting and initial allocation
  mm/list_lru: code clean up for reparenting
  mm/list_lru: don't export list_lru_add
  mm/list_lru: don't pass unnecessary key parameters
  kasan: add kunit tests for kmalloc_track_caller, kmalloc_node_track_caller
  kasan: change kasan_atomics kunit test as KUNIT_CASE_SLOW
  kasan: use EXPORT_SYMBOL_IF_KUNIT to export symbols
  ...
2024-11-23 09:58:07 -08:00
Linus Torvalds
fcc79e1714 Merge tag 'net-next-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Paolo Abeni:
 "The most significant set of changes is the per netns RTNL. The new
  behavior is disabled by default, regression risk should be contained.

  Notably the new config knob PTP_1588_CLOCK_VMCLOCK will inherit its
  default value from PTP_1588_CLOCK_KVM, as the first is intended to be
  a more reliable replacement for the latter.

  Core:

   - Started a very large, in-progress, effort to make the RTNL lock
     scope per network-namespace, thus reducing the lock contention
     significantly in the containerized use-case, comprising:
       - RCU-ified some relevant slices of the FIB control path
       - introduce basic per netns locking helpers
       - namespacified the IPv4 address hash table
       - remove rtnl_register{,_module}() in favour of
         rtnl_register_many()
       - refactor rtnl_{new,del,set}link() moving as much validation as
         possible out of RTNL lock
       - convert all phonet doit() and dumpit() handlers to RCU
       - convert IPv4 addresses manipulation to per-netns RTNL
       - convert virtual interface creation to per-netns RTNL
     the per-netns lock infrastructure is guarded by the
     CONFIG_DEBUG_NET_SMALL_RTNL knob, disabled by default ad interim.

   - Introduce NAPI suspension, to efficiently switching between busy
     polling (NAPI processing suspended) and normal processing.

   - Migrate the IPv4 routing input, output and control path from direct
     ToS usage to DSCP macros. This is a work in progress to make ECN
     handling consistent and reliable.

   - Add drop reasons support to the IPv4 rotue input path, allowing
     better introspection in case of packets drop.

   - Make FIB seqnum lockless, dropping RTNL protection for read access.

   - Make inet{,v6} addresses hashing less predicable.

   - Allow providing timestamp OPT_ID via cmsg, to correlate TX packets
     and timestamps

  Things we sprinkled into general kernel code:

   - Add small file operations for debugfs, to reduce the struct ops
     size.

   - Refactoring and optimization for the implementation of page_frag
     API, This is a preparatory work to consolidate the page_frag
     implementation.

  Netfilter:

   - Optimize set element transactions to reduce memory consumption

   - Extended netlink error reporting for attribute parser failure.

   - Make legacy xtables configs user selectable, giving users the
     option to configure iptables without enabling any other config.

   - Address a lot of false-positive RCU issues, pointed by recent CI
     improvements.

  BPF:

   - Put xsk sockets on a struct diet and add various cleanups. Overall,
     this helps to bump performance by 12% for some workloads.

   - Extend BPF selftests to increase coverage of XDP features in
     combination with BPF cpumap.

   - Optimize and homogenize bpf_csum_diff helper for all archs and also
     add a batch of new BPF selftests for it.

   - Extend netkit with an option to delegate skb->{mark,priority}
     scrubbing to its BPF program.

   - Make the bpf_get_netns_cookie() helper available also to tc(x) BPF
     programs.

  Protocols:

   - Introduces 4-tuple hash for connected udp sockets, speeding-up
     significantly connected sockets lookup.

   - Add a fastpath for some TCP timers that usually expires after
     close, the socket lock contention.

   - Add inbound and outbound xfrm state caches to speed up state
     lookups.

   - Avoid sending MPTCP advertisements on stale subflows, reducing
     risks on loosing them.

   - Make neighbours table flushing more scalable, maintaining per
     device neigh lists.

  Driver API:

   - Introduce a unified interface to configure transmission H/W
     shaping, and expose it to user-space via generic-netlink.

   - Add support for per-NAPI config via netlink. This makes napi
     configuration persistent across queues removal and re-creation.
     Requires driver updates, currently supported drivers are:
     nVidia/Mellanox mlx4 and mlx5, Broadcom brcm and Intel ice.

   - Add ethtool support for writing SFP / PHY firmware blocks.

   - Track RSS context allocation from ethtool core.

   - Implement support for mirroring to DSA CPU port, via TC mirror
     offload.

   - Consolidate FDB updates notification, to avoid duplicates on
     device-specific entries.

   - Expose DPLL clock quality level to the user-space.

   - Support master-slave PHY config via device tree.

  Tests and tooling:

   - forwarding: introduce deferred commands, to simplify the cleanup
     phase

  Drivers:

   - Updated several drivers - Amazon vNic, Google vNic, Microsoft vNic,
     Intel e1000e and Broadcom Tigon3 - to use netdev-genl to link the
     IRQs and queues to NAPI IDs, allowing busy polling and better
     introspection.

   - Ethernet high-speed NICs:
      - nVidia/Mellanox:
         - mlx5:
           - a large refactor to implement support for cross E-Switch
             scheduling
           - refactor H/W conter management to let it scale better
           - H/W GRO cleanups
      - Intel (100G, ice)::
         - add support for ethtool reset
         - implement support for per TX queue H/W shaping
      - AMD/Solarflare:
         - implement per device queue stats support
      - Broadcom (bnxt):
         - improve wildcard l4proto on IPv4/IPv6 ntuple rules
      - Marvell Octeon:
         - Add representor support for each Resource Virtualization Unit
           (RVU) device.
      - Hisilicon:
         - add support for the BMC Gigabit Ethernet
      - IBM (EMAC):
         - driver cleanup and modernization
      - Cisco (VIC):
         - raise the queues number limit to 256

   - Ethernet virtual:
      - Google vNIC:
         - implement page pool support
      - macsec:
         - inherit lower device's features and TSO limits when
           offloading
      - virtio_net:
         - enable premapped mode by default
         - support for XDP socket(AF_XDP) zerocopy TX
      - wireguard:
         - set the TSO max size to be GSO_MAX_SIZE, to aggregate larger
           packets.

   - Ethernet NICs embedded and virtual:
      - Broadcom ASP:
         - enable software timestamping
      - Freescale:
         - add enetc4 PF driver
      - MediaTek: Airoha SoC:
         - implement BQL support
      - RealTek r8169:
         - enable TSO by default on r8168/r8125
         - implement extended ethtool stats
      - Renesas AVB:
         - enable TX checksum offload
      - Synopsys (stmmac):
         - support header splitting for vlan tagged packets
         - move common code for DWMAC4 and DWXGMAC into a separate FPE
           module.
         - add dwmac driver support for T-HEAD TH1520 SoC
      - Synopsys (xpcs):
         - driver refactor and cleanup
      - TI:
         - icssg_prueth: add VLAN offload support
      - Xilinx emaclite:
         - add clock support

   - Ethernet switches:
      - Microchip:
         - implement support for the lan969x Ethernet switch family
         - add LAN9646 switch support to KSZ DSA driver

   - Ethernet PHYs:
      - Marvel: 88q2x: enable auto negotiation
      - Microchip: add support for LAN865X Rev B1 and LAN867X Rev C1/C2

   - PTP:
      - Add support for the Amazon virtual clock device
      - Add PtP driver for s390 clocks

   - WiFi:
      - mac80211
         - EHT 1024 aggregation size for transmissions
         - new operation to indicate that a new interface is to be added
         - support radio separation of multi-band devices
         - move wireless extension spy implementation to libiw
      - Broadcom:
         - brcmfmac: optional LPO clock support
      - Microchip:
         - add support for Atmel WILC3000
      - Qualcomm (ath12k):
         - firmware coredump collection support
         - add debugfs support for a multitude of statistics
      - Qualcomm (ath5k):
         -  Arcadyan ARV45XX AR2417 & Gigaset SX76[23] AR241[34]A support
      - Realtek:
         - rtw88: 8821au and 8812au USB adapters support
         - rtw89: add thermal protection
         - rtw89: fine tune BT-coexsitence to improve user experience
         - rtw89: firmware secure boot for WiFi 6 chip

   - Bluetooth
      - add Qualcomm WCN785x support for ids Foxconn 0xe0fc/0xe0f3 and
        0x13d3:0x3623
      - add Realtek RTL8852BE support for id Foxconn 0xe123
      - add MediaTek MT7920 support for wireless module ids
      - btintel_pcie: add handshake between driver and firmware
      - btintel_pcie: add recovery mechanism
      - btnxpuart: add GPIO support to power save feature"

* tag 'net-next-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1475 commits)
  mm: page_frag: fix a compile error when kernel is not compiled
  Documentation: tipc: fix formatting issue in tipc.rst
  selftests: nic_performance: Add selftest for performance of NIC driver
  selftests: nic_link_layer: Add selftest case for speed and duplex states
  selftests: nic_link_layer: Add link layer selftest for NIC driver
  bnxt_en: Add FW trace coredump segments to the coredump
  bnxt_en: Add a new ethtool -W dump flag
  bnxt_en: Add 2 parameters to bnxt_fill_coredump_seg_hdr()
  bnxt_en: Add functions to copy host context memory
  bnxt_en: Do not free FW log context memory
  bnxt_en: Manage the FW trace context memory
  bnxt_en: Allocate backing store memory for FW trace logs
  bnxt_en: Add a 'force' parameter to bnxt_free_ctx_mem()
  bnxt_en: Refactor bnxt_free_ctx_mem()
  bnxt_en: Add mem_valid bit to struct bnxt_ctx_mem_type
  bnxt_en: Update firmware interface spec to 1.10.3.85
  selftests/bpf: Add some tests with sockmap SK_PASS
  bpf: fix recursive lock when verdict program return SK_PASS
  wireguard: device: support big tcp GSO
  wireguard: selftests: load nf_conntrack if not present
  ...
2024-11-21 08:28:08 -08:00
Alexei Starovoitov
8714381703 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Cross-merge bpf fixes after downstream PR.

In particular to bring the fix in
commit aa30eb3260 ("bpf: Force checkpoint when jmp history is too long").
The follow up verifier work depends on it.
And the fix in
commit 6801cf7890 ("selftests/bpf: Use -4095 as the bad address for bits iterator").
It's fixing instability of BPF CI on s390 arch.

No conflicts.

Adjacent changes in:
Auto-merging arch/Kconfig
Auto-merging kernel/bpf/helpers.c
Auto-merging kernel/bpf/memalloc.c
Auto-merging kernel/bpf/verifier.c
Auto-merging mm/slab_common.c

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-11-13 12:52:51 -08:00
Martin Karsten
5dc51ec86d net: Add napi_struct parameter irq_suspend_timeout
Add a per-NAPI IRQ suspension parameter, which can be get/set with
netdev-genl.

This patch doesn't change any behavior but prepares the code for other
changes in the following commits which use irq_suspend_timeout as a
timeout for IRQ suspension.

Signed-off-by: Martin Karsten <mkarsten@uwaterloo.ca>
Co-developed-by: Joe Damato <jdamato@fastly.com>
Signed-off-by: Joe Damato <jdamato@fastly.com>
Tested-by: Joe Damato <jdamato@fastly.com>
Tested-by: Martin Karsten <mkarsten@uwaterloo.ca>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Link: https://patch.msgid.link/20241109050245.191288-2-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-11 18:45:05 -08:00
Jiri Olsa
d920179b3d bpf: Add support for uprobe multi session attach
Adding support to attach BPF program for entry and return probe
of the same function. This is common use case which at the moment
requires to create two uprobe multi links.

Adding new BPF_TRACE_UPROBE_SESSION attach type that instructs
kernel to attach single link program to both entry and exit probe.

It's possible to control execution of the BPF program on return
probe simply by returning zero or non zero from the entry BPF
program execution to execute or not the BPF program on return
probe respectively.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20241108134544.480660-4-jolsa@kernel.org
2024-11-11 08:18:03 -08:00
Lorenzo Stoakes
75d60eb30d tools: testing: update tools UAPI header for mman-common.h
Import the new MADV_GUARD_INSTALL/REMOVE madvise flags.

Link: https://lkml.kernel.org/r/ada462fa73fa1defc114242e446ab625b8290b71.1730123433.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Suggested-by: Jann Horn <jannh@google.com>
Suggested-by: David Hildenbrand <david@redhat.com>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Chris Zankel <chris@zankel.net>
Cc: Helge Deller <deller@gmx.de>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Jeff Xu <jeffxu@chromium.org>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vlastimil Babka <vbabkba@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-11 00:26:45 -08:00
Jakub Kicinski
5b1c965956 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.12-rc6).

Conflicts:

drivers/net/wireless/intel/iwlwifi/mvm/mld-mac80211.c
  cbe84e9ad5 ("wifi: iwlwifi: mvm: really send iwl_txpower_constraints_cmd")
  188a1bf894 ("wifi: mac80211: re-order assigning channel in activate links")
https://lore.kernel.org/all/20241028123621.7bbb131b@canb.auug.org.au/

net/mac80211/cfg.c
  c4382d5ca1 ("wifi: mac80211: update the right link for tx power")
  8dd0498983 ("wifi: mac80211: Fix setting txpower with emulate_chanctx")

drivers/net/ethernet/intel/ice/ice_ptp_hw.h
  6e58c33106 ("ice: fix crash on probe for DPLL enabled E810 LOM")
  e4291b64e1 ("ice: Align E810T GPIO to other products")
  ebb2693f8f ("ice: Read SDP section from NVM for pin definitions")
  ac532f4f42 ("ice: Cleanup unused declarations")
https://lore.kernel.org/all/20241030120524.1ee1af18@canb.auug.org.au/

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-31 18:10:07 -07:00
Linus Torvalds
14b7d43c5c Merge tag 'perf-tools-fixes-for-v6.12-2-2024-10-30' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Update more header copies with the kernel sources, including const.h,
   msr-index.h, arm64's cputype.h, kvm's, bits.h and unaligned.h

 - The return from 'write' isn't a pid, fix cut'n'paste error in 'perf
   trace'

 - Fix up the python binding build on architectures without
   HAVE_KVM_STAT_SUPPORT

 - Add some more bounds checks to augmented_raw_syscalls.bpf.c (used to
   collect syscall pointer arguments in 'perf trace') to make the
   resulting bytecode to pass the kernel BPF verifier, allowing us to go
   back accepting clang 12.0.1 as the minimum version required for
   compiling BPF sources

 - Add __NR_capget for x86 to fix a regression on running perf + intel
   PT (hw tracing) as non-root setting up the capabilities as described
   in https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html

 - Fix missing syscalltbl in non-explicitly listed architectures,
   noticed on ARM 32-bit, that still needs a .tbl generator for the
   syscall id<->name tables, should be added for v6.13

 - Handle 'perf test' failure when handling broken DWARF for ASM files

* tag 'perf-tools-fixes-for-v6.12-2-2024-10-30' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
  perf cap: Add __NR_capget to arch/x86 unistd
  tools headers: Update the linux/unaligned.h copy with the kernel sources
  tools headers arm64: Sync arm64's cputype.h with the kernel sources
  tools headers: Synchronize {uapi/}linux/bits.h with the kernel sources
  tools arch x86: Sync the msr-index.h copy with the kernel sources
  perf python: Fix up the build on architectures without HAVE_KVM_STAT_SUPPORT
  perf test: Handle perftool-testsuite_probe failure due to broken DWARF
  tools headers UAPI: Sync kvm headers with the kernel sources
  perf trace: Fix non-listed archs in the syscalltbl routines
  perf build: Change the clang check back to 12.0.1
  perf trace augmented_raw_syscalls: Add more checks to pass the verifier
  perf trace augmented_raw_syscalls: Add extra array index bounds checking to satisfy some BPF verifiers
  perf trace: The return from 'write' isn't a pid
  tools headers UAPI: Sync linux/const.h with the kernel headers
2024-10-30 11:17:47 -10:00