mirror of
https://github.com/armbian/linux-rockchip.git
synced 2026-01-06 11:08:10 -08:00
ce80098db2439ee44403ec6fccd3a10be21c7aff
308 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
8581fd402a |
treewide: Add missing includes masked by cgroup -> bpf dependency
cgroup.h (therefore swap.h, therefore half of the universe) includes bpf.h which in turn includes module.h and slab.h. Since we're about to get rid of that dependency we need to clean things up. v2: drop the cpu.h include from cacheinfo.h, it's not necessary and it makes riscv sensitive to ordering of include files. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Krzysztof Wilczyński <kw@linux.com> Acked-by: Peter Chen <peter.chen@kernel.org> Acked-by: SeongJae Park <sj@kernel.org> Acked-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/all/20211120035253.72074-1-kuba@kernel.org/ # v1 Link: https://lore.kernel.org/all/20211120165528.197359-1-kuba@kernel.org/ # cacheinfo discussion Link: https://lore.kernel.org/bpf/20211202203400.1208663-1-kuba@kernel.org |
||
|
|
79364031c5 |
bpf: Make sure bpf_disable_instrumentation() is safe vs preemption.
The initial implementation of migrate_disable() for mainline was a
wrapper around preempt_disable(). RT kernels substituted this with a
real migrate disable implementation.
Later on mainline gained true migrate disable support, but neither
documentation nor affected code were updated.
Remove stale comments claiming that migrate_disable() is PREEMPT_RT only.
Don't use __this_cpu_inc() in the !PREEMPT_RT path because preemption is
not disabled and the RMW operation can be preempted.
Fixes:
|
||
|
|
fc02cb2b37 |
Merge tag 'net-next-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core:
- Remove socket skb caches
- Add a SO_RESERVE_MEM socket op to forward allocate buffer space and
avoid memory accounting overhead on each message sent
- Introduce managed neighbor entries - added by control plane and
resolved by the kernel for use in acceleration paths (BPF / XDP
right now, HW offload users will benefit as well)
- Make neighbor eviction on link down controllable by userspace to
work around WiFi networks with bad roaming implementations
- vrf: Rework interaction with netfilter/conntrack
- fq_codel: implement L4S style ce_threshold_ect1 marking
- sch: Eliminate unnecessary RCU waits in mini_qdisc_pair_swap()
BPF:
- Add support for new btf kind BTF_KIND_TAG, arbitrary type tagging
as implemented in LLVM14
- Introduce bpf_get_branch_snapshot() to capture Last Branch Records
- Implement variadic trace_printk helper
- Add a new Bloomfilter map type
- Track <8-byte scalar spill and refill
- Access hw timestamp through BPF's __sk_buff
- Disallow unprivileged BPF by default
- Document BPF licensing
Netfilter:
- Introduce egress hook for looking at raw outgoing packets
- Allow matching on and modifying inner headers / payload data
- Add NFT_META_IFTYPE to match on the interface type either from
ingress or egress
Protocols:
- Multi-Path TCP:
- increase default max additional subflows to 2
- rework forward memory allocation
- add getsockopts: MPTCP_INFO, MPTCP_TCPINFO, MPTCP_SUBFLOW_ADDRS
- MCTP flow support allowing lower layer drivers to configure msg
muxing as needed
- Automatic Multicast Tunneling (AMT) driver based on RFC7450
- HSR support the redbox supervision frames (IEC-62439-3:2018)
- Support for the ip6ip6 encapsulation of IOAM
- Netlink interface for CAN-FD's Transmitter Delay Compensation
- Support SMC-Rv2 eliminating the current same-subnet restriction, by
exploiting the UDP encapsulation feature of RoCE adapters
- TLS: add SM4 GCM/CCM crypto support
- Bluetooth: initial support for link quality and audio/codec offload
Driver APIs:
- Add a batched interface for RX buffer allocation in AF_XDP buffer
pool
- ethtool: Add ability to control transceiver modules' power mode
- phy: Introduce supported interfaces bitmap to express MAC
capabilities and simplify PHY code
- Drop rtnl_lock from DSA .port_fdb_{add,del} callbacks
New drivers:
- WiFi driver for Realtek 8852AE 802.11ax devices (rtw89)
- Ethernet driver for ASIX AX88796C SPI device (x88796c)
Drivers:
- Broadcom PHYs
- support 72165, 7712 16nm PHYs
- support IDDQ-SR for additional power savings
- PHY support for QCA8081, QCA9561 PHYs
- NXP DPAA2: support for IRQ coalescing
- NXP Ethernet (enetc): support for software TCP segmentation
- Renesas Ethernet (ravb) - support DMAC and EMAC blocks of
Gigabit-capable IP found on RZ/G2L SoC
- Intel 100G Ethernet
- support for eswitch offload of TC/OvS flow API, including
offload of GRE, VxLAN, Geneve tunneling
- support application device queues - ability to assign Rx and Tx
queues to application threads
- PTP and PPS (pulse-per-second) extensions
- Broadcom Ethernet (bnxt)
- devlink health reporting and device reload extensions
- Mellanox Ethernet (mlx5)
- offload macvlan interfaces
- support HW offload of TC rules involving OVS internal ports
- support HW-GRO and header/data split
- support application device queues
- Marvell OcteonTx2:
- add XDP support for PF
- add PTP support for VF
- Qualcomm Ethernet switch (qca8k): support for QCA8328
- Realtek Ethernet DSA switch (rtl8366rb)
- support bridge offload
- support STP, fast aging, disabling address learning
- support for Realtek RTL8365MB-VC, a 4+1 port 10M/100M/1GE switch
- Mellanox Ethernet/IB switch (mlxsw)
- multi-level qdisc hierarchy offload (e.g. RED, prio and shaping)
- offload root TBF qdisc as port shaper
- support multiple routing interface MAC address prefixes
- support for IP-in-IP with IPv6 underlay
- MediaTek WiFi (mt76)
- mt7921 - ASPM, 6GHz, SDIO and testmode support
- mt7915 - LED and TWT support
- Qualcomm WiFi (ath11k)
- include channel rx and tx time in survey dump statistics
- support for 80P80 and 160 MHz bandwidths
- support channel 2 in 6 GHz band
- spectral scan support for QCN9074
- support for rx decapsulation offload (data frames in 802.3
format)
- Qualcomm phone SoC WiFi (wcn36xx)
- enable Idle Mode Power Save (IMPS) to reduce power consumption
during idle
- Bluetooth driver support for MediaTek MT7922 and MT7921
- Enable support for AOSP Bluetooth extension in Qualcomm WCN399x and
Realtek 8822C/8852A
- Microsoft vNIC driver (mana)
- support hibernation and kexec
- Google vNIC driver (gve)
- support for jumbo frames
- implement Rx page reuse
Refactor:
- Make all writes to netdev->dev_addr go thru helpers, so that we can
add this address to the address rbtree and handle the updates
- Various TCP cleanups and optimizations including improvements to
CPU cache use
- Simplify the gnet_stats, Qdisc stats' handling and remove
qdisc->running sequence counter
- Driver changes and API updates to address devlink locking
deficiencies"
* tag 'net-next-for-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2122 commits)
Revert "net: avoid double accounting for pure zerocopy skbs"
selftests: net: add arp_ndisc_evict_nocarrier
net: ndisc: introduce ndisc_evict_nocarrier sysctl parameter
net: arp: introduce arp_evict_nocarrier sysctl parameter
libbpf: Deprecate AF_XDP support
kbuild: Unify options for BTF generation for vmlinux and modules
selftests/bpf: Add a testcase for 64-bit bounds propagation issue.
bpf: Fix propagation of signed bounds from 64-bit min/max into 32-bit.
bpf: Fix propagation of bounds from 64-bit min/max into 32-bit and var_off.
net: vmxnet3: remove multiple false checks in vmxnet3_ethtool.c
net: avoid double accounting for pure zerocopy skbs
tcp: rename sk_wmem_free_skb
netdevsim: fix uninit value in nsim_drv_configure_vfs()
selftests/bpf: Fix also no-alu32 strobemeta selftest
bpf: Add missing map_delete_elem method to bloom filter map
selftests/bpf: Add bloom map success test for userspace calls
bpf: Add alignment padding for "map_extra" + consolidate holes
bpf: Bloom filter map naming fixups
selftests/bpf: Add test cases for struct_ops prog
bpf: Add dummy BPF STRUCT_OPS for test purpose
...
|
||
|
|
b7b98f8689 |
Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says: ==================== pull-request: bpf-next 2021-11-01 We've added 181 non-merge commits during the last 28 day(s) which contain a total of 280 files changed, 11791 insertions(+), 5879 deletions(-). The main changes are: 1) Fix bpf verifier propagation of 64-bit bounds, from Alexei. 2) Parallelize bpf test_progs, from Yucong and Andrii. 3) Deprecate various libbpf apis including af_xdp, from Andrii, Hengqi, Magnus. 4) Improve bpf selftests on s390, from Ilya. 5) bloomfilter bpf map type, from Joanne. 6) Big improvements to JIT tests especially on Mips, from Johan. 7) Support kernel module function calls from bpf, from Kumar. 8) Support typeless and weak ksym in light skeleton, from Kumar. 9) Disallow unprivileged bpf by default, from Pawan. 10) BTF_KIND_DECL_TAG support, from Yonghong. 11) Various bpftool cleanups, from Quentin. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (181 commits) libbpf: Deprecate AF_XDP support kbuild: Unify options for BTF generation for vmlinux and modules selftests/bpf: Add a testcase for 64-bit bounds propagation issue. bpf: Fix propagation of signed bounds from 64-bit min/max into 32-bit. bpf: Fix propagation of bounds from 64-bit min/max into 32-bit and var_off. selftests/bpf: Fix also no-alu32 strobemeta selftest bpf: Add missing map_delete_elem method to bloom filter map selftests/bpf: Add bloom map success test for userspace calls bpf: Add alignment padding for "map_extra" + consolidate holes bpf: Bloom filter map naming fixups selftests/bpf: Add test cases for struct_ops prog bpf: Add dummy BPF STRUCT_OPS for test purpose bpf: Factor out helpers for ctx access checking bpf: Factor out a helper to prepare trampoline for struct_ops prog selftests, bpf: Fix broken riscv build riscv, libbpf: Add RISC-V (RV64) support to bpf_tracing.h tools, build: Add RISC-V to HOSTARCH parsing riscv, bpf: Increase the maximum number of iterations selftests, bpf: Add one test for sockmap with strparser selftests, bpf: Fix test_txmsg_ingress_parser error ... ==================== Link: https://lore.kernel.org/r/20211102013123.9005-1-alexei.starovoitov@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> |
||
|
|
2dc26d98cf |
Merge tag 'overflow-v5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull overflow updates from Kees Cook:
"The end goal of the current buffer overflow detection work[0] is to
gain full compile-time and run-time coverage of all detectable buffer
overflows seen via array indexing or memcpy(), memmove(), and
memset(). The str*() family of functions already have full coverage.
While much of the work for these changes have been on-going for many
releases (i.e. 0-element and 1-element array replacements, as well as
avoiding false positives and fixing discovered overflows[1]), this
series contains the foundational elements of several related buffer
overflow detection improvements by providing new common helpers and
FORTIFY_SOURCE changes needed to gain the introspection required for
compiler visibility into array sizes. Also included are a handful of
already Acked instances using the helpers (or related clean-ups), with
many more waiting at the ready to be taken via subsystem-specific
trees[2].
The new helpers are:
- struct_group() for gaining struct member range introspection
- memset_after() and memset_startat() for clearing to the end of
structures
- DECLARE_FLEX_ARRAY() for using flex arrays in unions or alone in
structs
Also included is the beginning of the refactoring of FORTIFY_SOURCE to
support memcpy() introspection, fix missing and regressed coverage
under GCC, and to prepare to fix the currently broken Clang support.
Finishing this work is part of the larger series[0], but depends on
all the false positives and buffer overflow bug fixes to have landed
already and those that depend on this series to land.
As part of the FORTIFY_SOURCE refactoring, a set of both a
compile-time and run-time tests are added for FORTIFY_SOURCE and the
mem*()-family functions respectively. The compile time tests have
found a legitimate (though corner-case) bug[6] already.
Please note that the appearance of "panic" and "BUG" in the
FORTIFY_SOURCE refactoring are the result of relocating existing code,
and no new use of those code-paths are expected nor desired.
Finally, there are two tree-wide conversions for 0-element arrays and
flexible array unions to gain sane compiler introspection coverage
that result in no known object code differences.
After this series (and the changes that have now landed via netdev and
usb), we are very close to finally being able to build with
-Warray-bounds and -Wzero-length-bounds.
However, due corner cases in GCC[3] and Clang[4], I have not included
the last two patches that turn on these options, as I don't want to
introduce any known warnings to the build. Hopefully these can be
solved soon"
Link: https://lore.kernel.org/lkml/20210818060533.3569517-1-keescook@chromium.org/ [0]
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/?qt=grep&q=FORTIFY_SOURCE [1]
Link: https://lore.kernel.org/lkml/202108220107.3E26FE6C9C@keescook/ [2]
Link: https://lore.kernel.org/lkml/3ab153ec-2798-da4c-f7b1-81b0ac8b0c5b@roeck-us.net/ [3]
Link: https://bugs.llvm.org/show_bug.cgi?id=51682 [4]
Link: https://lore.kernel.org/lkml/202109051257.29B29745C0@keescook/ [5]
Link: https://lore.kernel.org/lkml/20211020200039.170424-1-keescook@chromium.org/ [6]
* tag 'overflow-v5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (30 commits)
fortify: strlen: Avoid shadowing previous locals
compiler-gcc.h: Define __SANITIZE_ADDRESS__ under hwaddress sanitizer
treewide: Replace 0-element memcpy() destinations with flexible arrays
treewide: Replace open-coded flex arrays in unions
stddef: Introduce DECLARE_FLEX_ARRAY() helper
btrfs: Use memset_startat() to clear end of struct
string.h: Introduce memset_startat() for wiping trailing members and padding
xfrm: Use memset_after() to clear padding
string.h: Introduce memset_after() for wiping trailing members/padding
lib: Introduce CONFIG_MEMCPY_KUNIT_TEST
fortify: Add compile-time FORTIFY_SOURCE tests
fortify: Allow strlen() and strnlen() to pass compile-time known lengths
fortify: Prepare to improve strnlen() and strlen() warnings
fortify: Fix dropped strcpy() compile-time write overflow check
fortify: Explicitly disable Clang support
fortify: Move remaining fortify helpers into fortify-string.h
lib/string: Move helper functions out of string.c
compiler_types.h: Remove __compiletime_object_size()
cm4000_cs: Use struct_group() to zero struct cm4000_dev region
can: flexcan: Use struct_group() to zero struct flexcan_regs regions
...
|
||
|
|
7df621a3ee |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
include/net/sock.h |
||
|
|
61a0abaee2 |
bpf: Use u64_stats_t in struct bpf_prog_stats
Commit
|
||
|
|
f941eadd8d |
bpf: Avoid races in __bpf_prog_run() for 32bit arches
__bpf_prog_run() can run from non IRQ contexts, meaning
it could be re entered if interrupted.
This calls for the irq safe variant of u64_stats_update_{begin|end},
or risk a deadlock.
This patch is a nop on 64bit arches, fortunately.
syzbot report:
WARNING: inconsistent lock state
5.12.0-rc3-syzkaller #0 Not tainted
--------------------------------
inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
udevd/4013 [HC0[0]:SC0[0]:HE1:SE1] takes:
ff7c9dec (&(&pstats->syncp)->seq){+.?.}-{0:0}, at: sk_filter include/linux/filter.h:867 [inline]
ff7c9dec (&(&pstats->syncp)->seq){+.?.}-{0:0}, at: do_one_broadcast net/netlink/af_netlink.c:1468 [inline]
ff7c9dec (&(&pstats->syncp)->seq){+.?.}-{0:0}, at: netlink_broadcast_filtered+0x27c/0x4fc net/netlink/af_netlink.c:1520
{IN-SOFTIRQ-W} state was registered at:
lock_acquire.part.0+0xf0/0x41c kernel/locking/lockdep.c:5510
lock_acquire+0x6c/0x74 kernel/locking/lockdep.c:5483
do_write_seqcount_begin_nested include/linux/seqlock.h:520 [inline]
do_write_seqcount_begin include/linux/seqlock.h:545 [inline]
u64_stats_update_begin include/linux/u64_stats_sync.h:129 [inline]
bpf_prog_run_pin_on_cpu include/linux/filter.h:624 [inline]
bpf_prog_run_clear_cb+0x1bc/0x270 include/linux/filter.h:755
run_filter+0xa0/0x17c net/packet/af_packet.c:2031
packet_rcv+0xc0/0x3e0 net/packet/af_packet.c:2104
dev_queue_xmit_nit+0x2bc/0x39c net/core/dev.c:2387
xmit_one net/core/dev.c:3588 [inline]
dev_hard_start_xmit+0x94/0x518 net/core/dev.c:3609
sch_direct_xmit+0x11c/0x1f0 net/sched/sch_generic.c:313
qdisc_restart net/sched/sch_generic.c:376 [inline]
__qdisc_run+0x194/0x7f8 net/sched/sch_generic.c:384
qdisc_run include/net/pkt_sched.h:136 [inline]
qdisc_run include/net/pkt_sched.h:128 [inline]
__dev_xmit_skb net/core/dev.c:3795 [inline]
__dev_queue_xmit+0x65c/0xf84 net/core/dev.c:4150
dev_queue_xmit+0x14/0x18 net/core/dev.c:4215
neigh_resolve_output net/core/neighbour.c:1491 [inline]
neigh_resolve_output+0x170/0x228 net/core/neighbour.c:1471
neigh_output include/net/neighbour.h:510 [inline]
ip6_finish_output2+0x2e4/0x9fc net/ipv6/ip6_output.c:117
__ip6_finish_output net/ipv6/ip6_output.c:182 [inline]
__ip6_finish_output+0x164/0x3f8 net/ipv6/ip6_output.c:161
ip6_finish_output+0x2c/0xb0 net/ipv6/ip6_output.c:192
NF_HOOK_COND include/linux/netfilter.h:290 [inline]
ip6_output+0x74/0x294 net/ipv6/ip6_output.c:215
dst_output include/net/dst.h:448 [inline]
NF_HOOK include/linux/netfilter.h:301 [inline]
NF_HOOK include/linux/netfilter.h:295 [inline]
mld_sendpack+0x2a8/0x7e4 net/ipv6/mcast.c:1679
mld_send_cr net/ipv6/mcast.c:1975 [inline]
mld_ifc_timer_expire+0x1e8/0x494 net/ipv6/mcast.c:2474
call_timer_fn+0xd0/0x570 kernel/time/timer.c:1431
expire_timers kernel/time/timer.c:1476 [inline]
__run_timers kernel/time/timer.c:1745 [inline]
run_timer_softirq+0x2e4/0x384 kernel/time/timer.c:1758
__do_softirq+0x204/0x7ac kernel/softirq.c:345
do_softirq_own_stack include/asm-generic/softirq_stack.h:10 [inline]
invoke_softirq kernel/softirq.c:228 [inline]
__irq_exit_rcu+0x1d8/0x200 kernel/softirq.c:422
irq_exit+0x10/0x3c kernel/softirq.c:446
__handle_domain_irq+0xb4/0x120 kernel/irq/irqdesc.c:692
handle_domain_irq include/linux/irqdesc.h:176 [inline]
gic_handle_irq+0x84/0xac drivers/irqchip/irq-gic.c:370
__irq_svc+0x5c/0x94 arch/arm/kernel/entry-armv.S:205
debug_smp_processor_id+0x0/0x24 lib/smp_processor_id.c:53
rcu_read_lock_held_common kernel/rcu/update.c:108 [inline]
rcu_read_lock_sched_held+0x24/0x7c kernel/rcu/update.c:123
trace_lock_acquire+0x24c/0x278 include/trace/events/lock.h:13
lock_acquire+0x3c/0x74 kernel/locking/lockdep.c:5481
rcu_lock_acquire include/linux/rcupdate.h:267 [inline]
rcu_read_lock include/linux/rcupdate.h:656 [inline]
avc_has_perm_noaudit+0x6c/0x260 security/selinux/avc.c:1150
selinux_inode_permission+0x140/0x220 security/selinux/hooks.c:3141
security_inode_permission+0x44/0x60 security/security.c:1268
inode_permission.part.0+0x5c/0x13c fs/namei.c:521
inode_permission fs/namei.c:494 [inline]
may_lookup fs/namei.c:1652 [inline]
link_path_walk.part.0+0xd4/0x38c fs/namei.c:2208
link_path_walk fs/namei.c:2189 [inline]
path_lookupat+0x3c/0x1b8 fs/namei.c:2419
filename_lookup+0xa8/0x1a4 fs/namei.c:2453
user_path_at_empty+0x74/0x90 fs/namei.c:2733
do_readlinkat+0x5c/0x12c fs/stat.c:417
__do_sys_readlink fs/stat.c:450 [inline]
sys_readlink+0x24/0x28 fs/stat.c:447
ret_fast_syscall+0x0/0x2c arch/arm/mm/proc-v7.S:64
0x7eaa4974
irq event stamp: 298277
hardirqs last enabled at (298277): [<802000d0>] no_work_pending+0x4/0x34
hardirqs last disabled at (298276): [<8020c9b8>] do_work_pending+0x9c/0x648 arch/arm/kernel/signal.c:676
softirqs last enabled at (298216): [<8020167c>] __do_softirq+0x584/0x7ac kernel/softirq.c:372
softirqs last disabled at (298201): [<8024dff4>] do_softirq_own_stack include/asm-generic/softirq_stack.h:10 [inline]
softirqs last disabled at (298201): [<8024dff4>] invoke_softirq kernel/softirq.c:228 [inline]
softirqs last disabled at (298201): [<8024dff4>] __irq_exit_rcu+0x1d8/0x200 kernel/softirq.c:422
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&(&pstats->syncp)->seq);
<Interrupt>
lock(&(&pstats->syncp)->seq);
*** DEADLOCK ***
1 lock held by udevd/4013:
#0: 82b09c5c (rcu_read_lock){....}-{1:2}, at: sk_filter_trim_cap+0x54/0x434 net/core/filter.c:139
stack backtrace:
CPU: 1 PID: 4013 Comm: udevd Not tainted 5.12.0-rc3-syzkaller #0
Hardware name: ARM-Versatile Express
Backtrace:
[<81802550>] (dump_backtrace) from [<818027c4>] (show_stack+0x18/0x1c arch/arm/kernel/traps.c:252)
r7:00000080 r6:600d0093 r5:00000000 r4:82b58344
[<818027ac>] (show_stack) from [<81809e98>] (__dump_stack lib/dump_stack.c:79 [inline])
[<818027ac>] (show_stack) from [<81809e98>] (dump_stack+0xb8/0xe8 lib/dump_stack.c:120)
[<81809de0>] (dump_stack) from [<81804a00>] (print_usage_bug.part.0+0x228/0x230 kernel/locking/lockdep.c:3806)
r7:86bcb768 r6:81a0326c r5:830f96a8 r4:86bcb0c0
[<818047d8>] (print_usage_bug.part.0) from [<802bb1b8>] (print_usage_bug kernel/locking/lockdep.c:3776 [inline])
[<818047d8>] (print_usage_bug.part.0) from [<802bb1b8>] (valid_state kernel/locking/lockdep.c:3818 [inline])
[<818047d8>] (print_usage_bug.part.0) from [<802bb1b8>] (mark_lock_irq kernel/locking/lockdep.c:4021 [inline])
[<818047d8>] (print_usage_bug.part.0) from [<802bb1b8>] (mark_lock.part.0+0xc34/0x136c kernel/locking/lockdep.c:4478)
r10:83278fe8 r9:82c6d748 r8:00000000 r7:82c6d2d4 r6:00000004 r5:86bcb768
r4:00000006
[<802ba584>] (mark_lock.part.0) from [<802bc644>] (mark_lock kernel/locking/lockdep.c:4442 [inline])
[<802ba584>] (mark_lock.part.0) from [<802bc644>] (mark_usage kernel/locking/lockdep.c:4391 [inline])
[<802ba584>] (mark_lock.part.0) from [<802bc644>] (__lock_acquire+0x9bc/0x3318 kernel/locking/lockdep.c:4854)
r10:86bcb768 r9:86bcb0c0 r8:00000001 r7:00040000 r6:0000075a r5:830f96a8
r4:00000000
[<802bbc88>] (__lock_acquire) from [<802bfb90>] (lock_acquire.part.0+0xf0/0x41c kernel/locking/lockdep.c:5510)
r10:00000000 r9:600d0013 r8:00000000 r7:00000000 r6:828a2680 r5:828a2680
r4:861e5bc8
[<802bfaa0>] (lock_acquire.part.0) from [<802bff28>] (lock_acquire+0x6c/0x74 kernel/locking/lockdep.c:5483)
r10:8146137c r9:00000000 r8:00000001 r7:00000000 r6:00000000 r5:00000000
r4:ff7c9dec
[<802bfebc>] (lock_acquire) from [<81381eb4>] (do_write_seqcount_begin_nested include/linux/seqlock.h:520 [inline])
[<802bfebc>] (lock_acquire) from [<81381eb4>] (do_write_seqcount_begin include/linux/seqlock.h:545 [inline])
[<802bfebc>] (lock_acquire) from [<81381eb4>] (u64_stats_update_begin include/linux/u64_stats_sync.h:129 [inline])
[<802bfebc>] (lock_acquire) from [<81381eb4>] (__bpf_prog_run_save_cb include/linux/filter.h:727 [inline])
[<802bfebc>] (lock_acquire) from [<81381eb4>] (bpf_prog_run_save_cb include/linux/filter.h:741 [inline])
[<802bfebc>] (lock_acquire) from [<81381eb4>] (sk_filter_trim_cap+0x26c/0x434 net/core/filter.c:149)
r10:a4095dd0 r9:ff7c9dd0 r8:e44be000 r7:8146137c r6:00000001 r5:8611ba80
r4:00000000
[<81381c48>] (sk_filter_trim_cap) from [<8146137c>] (sk_filter include/linux/filter.h:867 [inline])
[<81381c48>] (sk_filter_trim_cap) from [<8146137c>] (do_one_broadcast net/netlink/af_netlink.c:1468 [inline])
[<81381c48>] (sk_filter_trim_cap) from [<8146137c>] (netlink_broadcast_filtered+0x27c/0x4fc net/netlink/af_netlink.c:1520)
r10:00000001 r9:833d6b1c r8:00000000 r7:8572f864 r6:8611ba80 r5:8698d800
r4:8572f800
[<81461100>] (netlink_broadcast_filtered) from [<81463e60>] (netlink_broadcast net/netlink/af_netlink.c:1544 [inline])
[<81461100>] (netlink_broadcast_filtered) from [<81463e60>] (netlink_sendmsg+0x3d0/0x478 net/netlink/af_netlink.c:1925)
r10:00000000 r9:00000002 r8:8698d800 r7:000000b7 r6:8611b900 r5:861e5f50
r4:86aa3000
[<81463a90>] (netlink_sendmsg) from [<81321f54>] (sock_sendmsg_nosec net/socket.c:654 [inline])
[<81463a90>] (netlink_sendmsg) from [<81321f54>] (sock_sendmsg+0x3c/0x4c net/socket.c:674)
r10:00000000 r9:861e5dd4 r8:00000000 r7:86570000 r6:00000000 r5:86570000
r4:861e5f50
[<81321f18>] (sock_sendmsg) from [<813234d0>] (____sys_sendmsg+0x230/0x29c net/socket.c:2350)
r5:00000040 r4:861e5f50
[<813232a0>] (____sys_sendmsg) from [<8132549c>] (___sys_sendmsg+0xac/0xe4 net/socket.c:2404)
r10:00000128 r9:861e4000 r8:00000000 r7:00000000 r6:86570000 r5:861e5f50
r4:00000000
[<813253f0>] (___sys_sendmsg) from [<81325684>] (__sys_sendmsg net/socket.c:2433 [inline])
[<813253f0>] (___sys_sendmsg) from [<81325684>] (__do_sys_sendmsg net/socket.c:2442 [inline])
[<813253f0>] (___sys_sendmsg) from [<81325684>] (sys_sendmsg+0x58/0xa0 net/socket.c:2440)
r8:80200224 r7:00000128 r6:00000000 r5:7eaa541c r4:86570000
[<8132562c>] (sys_sendmsg) from [<80200060>] (ret_fast_syscall+0x0/0x2c arch/arm/mm/proc-v7.S:64)
Exception stack(0x861e5fa8 to 0x861e5ff0)
5fa0: 00000000 00000000 0000000c 7eaa541c 00000000 00000000
5fc0: 00000000 00000000 76fbf840 00000128 00000000
|
||
|
|
fadb7ff1a6 |
bpf: Prevent increasing bpf_jit_limit above max
Restrict bpf_jit_limit to the maximum supported by the arch's JIT. Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20211014142554.53120-4-lmb@cloudflare.com |
||
|
|
fa7845cfd5 |
treewide: Replace open-coded flex arrays in unions
In support of enabling -Warray-bounds and -Wzero-length-bounds and
correctly handling run-time memcpy() bounds checking, replace all
open-coded flexible arrays (i.e. 0-element arrays) in unions with the
DECLARE_FLEX_ARRAY() helper macro.
This fixes warnings such as:
fs/hpfs/anode.c: In function 'hpfs_add_sector_to_btree':
fs/hpfs/anode.c:209:27: warning: array subscript 0 is outside the bounds of an interior zero-length array 'struct bplus_internal_node[0]' [-Wzero-length-bounds]
209 | anode->btree.u.internal[0].down = cpu_to_le32(a);
| ~~~~~~~~~~~~~~~~~~~~~~~^~~
In file included from fs/hpfs/hpfs_fn.h:26,
from fs/hpfs/anode.c:10:
fs/hpfs/hpfs.h:412:32: note: while referencing 'internal'
412 | struct bplus_internal_node internal[0]; /* (internal) 2-word entries giving
| ^~~~~~~~
drivers/net/can/usb/etas_es58x/es58x_fd.c: In function 'es58x_fd_tx_can_msg':
drivers/net/can/usb/etas_es58x/es58x_fd.c:360:35: warning: array subscript 65535 is outside the bounds of an interior zero-length array 'u8[0]' {aka 'unsigned char[]'} [-Wzero-length-bounds]
360 | tx_can_msg = (typeof(tx_can_msg))&es58x_fd_urb_cmd->raw_msg[msg_len];
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from drivers/net/can/usb/etas_es58x/es58x_core.h:22,
from drivers/net/can/usb/etas_es58x/es58x_fd.c:17:
drivers/net/can/usb/etas_es58x/es58x_fd.h:231:6: note: while referencing 'raw_msg'
231 | u8 raw_msg[0];
| ^~~~~~~
Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Ayush Sawal <ayush.sawal@chelsio.com>
Cc: Vinay Kumar Yadav <vinay.yadav@chelsio.com>
Cc: Rohit Maheshwari <rohitm@chelsio.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Stanislaw Gruszka <stf_xl@wp.pl>
Cc: Luca Coelho <luciano.coelho@intel.com>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Johannes Berg <johannes.berg@intel.com>
Cc: Mordechay Goodstein <mordechay.goodstein@intel.com>
Cc: Lee Jones <lee.jones@linaro.org>
Cc: Wolfgang Grandegger <wg@grandegger.com>
Cc: Marc Kleine-Budde <mkl@pengutronix.de>
Cc: Arunachalam Santhanam <arunachalam.santhanam@in.bosch.com>
Cc: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Cc: linux-crypto@vger.kernel.org
Cc: ath10k@lists.infradead.org
Cc: linux-wireless@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: linux-scsi@vger.kernel.org
Cc: linux-can@vger.kernel.org
Cc: bpf@vger.kernel.org
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # drivers/net/can/usb/etas_es58x/*
Signed-off-by: Kees Cook <keescook@chromium.org>
|
||
|
|
102acbacfd |
bpf: Replace callers of BPF_CAST_CALL with proper function typedef
In order to keep ahead of cases in the kernel where Control Flow Integrity (CFI) may trip over function call casts, enabling -Wcast-function-type is helpful. To that end, BPF_CAST_CALL causes various warnings and is one of the last places in the kernel triggering this warning. For actual function calls, replace BPF_CAST_CALL() with a typedef, which captures the same details about the given function pointers. This change results in no object code difference. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://github.com/KSPP/linux/issues/20 Link: https://lore.kernel.org/lkml/CAEf4Bzb46=-J5Fxc3mMZ8JQPtK1uoE0q6+g6WPz53Cvx=CBEhw@mail.gmail.com Link: https://lore.kernel.org/bpf/20210928230946.4062144-3-keescook@chromium.org |
||
|
|
3d717fad50 |
bpf: Replace "want address" users of BPF_CAST_CALL with BPF_CALL_IMM
In order to keep ahead of cases in the kernel where Control Flow Integrity (CFI) may trip over function call casts, enabling -Wcast-function-type is helpful. To that end, BPF_CAST_CALL causes various warnings and is one of the last places in the kernel triggering this warning. Most places using BPF_CAST_CALL actually just want a void * to perform math on. It's not actually performing a call, so just use a different helper to get the void *, by way of the new BPF_CALL_IMM() helper, which can clean up a common copy/paste idiom as well. This change results in no object code difference. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://github.com/KSPP/linux/issues/20 Link: https://lore.kernel.org/lkml/CAEf4Bzb46=-J5Fxc3mMZ8JQPtK1uoE0q6+g6WPz53Cvx=CBEhw@mail.gmail.com Link: https://lore.kernel.org/bpf/20210928230946.4062144-2-keescook@chromium.org |
||
|
|
b250e6d141 |
Merge tag 'kbuild-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada: - Add -s option (strict mode) to merge_config.sh to make it fail when any symbol is redefined. - Show a warning if a different compiler is used for building external modules. - Infer --target from ARCH for CC=clang to let you cross-compile the kernel without CROSS_COMPILE. - Make the integrated assembler default (LLVM_IAS=1) for CC=clang. - Add <linux/stdarg.h> to the kernel source instead of borrowing <stdarg.h> from the compiler. - Add Nick Desaulniers as a Kbuild reviewer. - Drop stale cc-option tests. - Fix the combination of CONFIG_TRIM_UNUSED_KSYMS and CONFIG_LTO_CLANG to handle symbols in inline assembly. - Show a warning if 'FORCE' is missing for if_changed rules. - Various cleanups * tag 'kbuild-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (39 commits) kbuild: redo fake deps at include/ksym/*.h kbuild: clean up objtool_args slightly modpost: get the *.mod file path more simply checkkconfigsymbols.py: Fix the '--ignore' option kbuild: merge vmlinux_link() between ARCH=um and other architectures kbuild: do not remove 'linux' link in scripts/link-vmlinux.sh kbuild: merge vmlinux_link() between the ordinary link and Clang LTO kbuild: remove stale *.symversions kbuild: remove unused quiet_cmd_update_lto_symversions gen_compile_commands: extract compiler command from a series of commands x86: remove cc-option-yn test for -mtune= arc: replace cc-option-yn uses with cc-option s390: replace cc-option-yn uses with cc-option ia64: move core-y in arch/ia64/Makefile to arch/ia64/Kbuild sparc: move the install rule to arch/sparc/Makefile security: remove unneeded subdir-$(CONFIG_...) kbuild: sh: remove unused install script kbuild: Fix 'no symbols' warning when CONFIG_TRIM_UNUSD_KSYMS=y kbuild: Switch to 'f' variants of integrated assembler flag kbuild: Shuffle blank line to improve comment meaning ... |
||
|
|
39f75da7bc |
isystem: trim/fixup stdarg.h and other headers
Delete/fixup few includes in anticipation of global -isystem compile option removal. Note: crypto/aegis128-neon-inner.c keeps <stddef.h> due to redefinition of uintptr_t error (one definition comes from <stddef.h>, another from <linux/types.h>). Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> |
||
|
|
7d08c2c911 |
bpf: Refactor BPF_PROG_RUN_ARRAY family of macros into functions
Similar to BPF_PROG_RUN, turn BPF_PROG_RUN_ARRAY macros into proper functions with all the same readability and maintainability benefits. Making them into functions required shuffling around bpf_set_run_ctx/bpf_reset_run_ctx functions. Also, explicitly specifying the type of the BPF prog run callback required adjusting __bpf_prog_run_save_cb() to accept const void *, casted internally to const struct sk_buff. Further, split out a cgroup-specific BPF_PROG_RUN_ARRAY_CG and BPF_PROG_RUN_ARRAY_CG_FLAGS from the more generic BPF_PROG_RUN_ARRAY due to the differences in bpf_run_ctx used for those two different use cases. I think BPF_PROG_RUN_ARRAY_CG would benefit from further refactoring to accept struct cgroup and enum bpf_attach_type instead of bpf_prog_array, fetching cgrp->bpf.effective[type] and RCU-dereferencing it internally. But that required including include/linux/cgroup-defs.h, which I wasn't sure is ok with everyone. The remaining generic BPF_PROG_RUN_ARRAY function will be extended to pass-through user-provided context value in the next patch. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210815070609.987780-3-andrii@kernel.org |
||
|
|
fb7dd8bca0 |
bpf: Refactor BPF_PROG_RUN into a function
Turn BPF_PROG_RUN into a proper always inlined function. No functional and performance changes are intended, but it makes it much easier to understand what's going on with how BPF programs are actually get executed. It's more obvious what types and callbacks are expected. Also extra () around input parameters can be dropped, as well as `__` variable prefixes intended to avoid naming collisions, which makes the code simpler to read and write. This refactoring also highlighted one extra issue. BPF_PROG_RUN is both a macro and an enum value (BPF_PROG_RUN == BPF_PROG_TEST_RUN). Turning BPF_PROG_RUN into a function causes naming conflict compilation error. So rename BPF_PROG_RUN into lower-case bpf_prog_run(), similar to bpf_prog_run_xdp(), bpf_prog_run_pin_on_cpu(), etc. All existing callers of BPF_PROG_RUN, the macro, are switched to bpf_prog_run() explicitly. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210815070609.987780-2-andrii@kernel.org |
||
|
|
879af96ffd |
net, core: Add support for XDP redirection to slave device
This adds the ndo_xdp_get_xmit_slave hook for transforming XDP_TX into XDP_REDIRECT after BPF program run when the ingress device is a bond slave. The dev_xdp_prog_count is exposed so that slave devices can be checked for loaded XDP programs in order to avoid the situation where both bond master and slave have programs loaded according to xdp_state. Signed-off-by: Jussi Maki <joamaki@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Jay Vosburgh <j.vosburgh@gmail.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <andy@greyhouse.net> Link: https://lore.kernel.org/bpf/20210731055738.16820-3-joamaki@gmail.com |
||
|
|
d2e11fd2b7 |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Conflicting commits, all resolutions pretty trivial: drivers/bus/mhi/pci_generic.c |
||
|
|
f5e81d1117 |
bpf: Introduce BPF nospec instruction for mitigating Spectre v4
In case of JITs, each of the JIT backends compiles the BPF nospec instruction /either/ to a machine instruction which emits a speculation barrier /or/ to /no/ machine instruction in case the underlying architecture is not affected by Speculative Store Bypass or has different mitigations in place already. This covers both x86 and (implicitly) arm64: In case of x86, we use 'lfence' instruction for mitigation. In case of arm64, we rely on the firmware mitigation as controlled via the ssbd kernel parameter. Whenever the mitigation is enabled, it works for all of the kernel code with no need to provide any additional instructions here (hence only comment in arm64 JIT). Other archs can follow as needed. The BPF nospec instruction is specifically targeting Spectre v4 since i) we don't use a serialization barrier for the Spectre v1 case, and ii) mitigation instructions for v1 and v4 might be different on some archs. The BPF nospec is required for a future commit, where the BPF verifier does annotate intermediate BPF programs with speculation barriers. Co-developed-by: Piotr Krysiuk <piotras@gmail.com> Co-developed-by: Benedict Schlueter <benedict.schlueter@rub.de> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Piotr Krysiuk <piotras@gmail.com> Signed-off-by: Benedict Schlueter <benedict.schlueter@rub.de> Acked-by: Alexei Starovoitov <ast@kernel.org> |
||
|
|
1e37392ccc |
bpf: Enable BPF_TRAMP_F_IP_ARG for trampolines with call_get_func_ip
Enabling BPF_TRAMP_F_IP_ARG for trampolines that actually need it. The BPF_TRAMP_F_IP_ARG adds extra 3 instructions to trampoline code and is used only by programs with bpf_get_func_ip helper, which is added in following patch and sets call_get_func_ip bit. This patch ensures that BPF_TRAMP_F_IP_ARG flag is used only for trampolines that have programs with call_get_func_ip set. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210714094400.396467-3-jolsa@kernel.org |
||
|
|
782347b6bc |
xdp: Add proper __rcu annotations to redirect map entries
XDP_REDIRECT works by a three-step process: the bpf_redirect() and bpf_redirect_map() helpers will lookup the target of the redirect and store it (along with some other metadata) in a per-CPU struct bpf_redirect_info. Next, when the program returns the XDP_REDIRECT return code, the driver will call xdp_do_redirect() which will use the information thus stored to actually enqueue the frame into a bulk queue structure (that differs slightly by map type, but shares the same principle). Finally, before exiting its NAPI poll loop, the driver will call xdp_do_flush(), which will flush all the different bulk queues, thus completing the redirect. Pointers to the map entries will be kept around for this whole sequence of steps, protected by RCU. However, there is no top-level rcu_read_lock() in the core code; instead drivers add their own rcu_read_lock() around the XDP portions of the code, but somewhat inconsistently as Martin discovered[0]. However, things still work because everything happens inside a single NAPI poll sequence, which means it's between a pair of calls to local_bh_disable()/local_bh_enable(). So Paul suggested[1] that we could document this intention by using rcu_dereference_check() with rcu_read_lock_bh_held() as a second parameter, thus allowing sparse and lockdep to verify that everything is done correctly. This patch does just that: we add an __rcu annotation to the map entry pointers and remove the various comments explaining the NAPI poll assurance strewn through devmap.c in favour of a longer explanation in filter.c. The goal is to have one coherent documentation of the entire flow, and rely on the RCU annotations as a "standard" way of communicating the flow in the map code (which can additionally be understood by sparse and lockdep). The RCU annotation replacements result in a fairly straight-forward replacement where READ_ONCE() becomes rcu_dereference_check(), WRITE_ONCE() becomes rcu_assign_pointer() and xchg() and cmpxchg() gets wrapped in the proper constructs to cast the pointer back and forth between __rcu and __kernel address space (for the benefit of sparse). The one complication is that xskmap has a few constructions where double-pointers are passed back and forth; these simply all gain __rcu annotations, and only the final reference/dereference to the inner-most pointer gets changed. With this, everything can be run through sparse without eliciting complaints, and lockdep can verify correctness even without the use of rcu_read_lock() in the drivers. Subsequent patches will clean these up from the drivers. [0] https://lore.kernel.org/bpf/20210415173551.7ma4slcbqeyiba2r@kafai-mbp.dhcp.thefacebook.com/ [1] https://lore.kernel.org/bpf/20210419165837.GA975577@paulmck-ThinkPad-P17-Gen-1/ Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210624160609.292325-6-toke@redhat.com |
||
|
|
d5e4ddaeb6 |
bpf: Support socket migration by eBPF.
This patch introduces a new bpf_attach_type for BPF_PROG_TYPE_SK_REUSEPORT
to check if the attached eBPF program is capable of migrating sockets. When
the eBPF program is attached, we run it for socket migration if the
expected_attach_type is BPF_SK_REUSEPORT_SELECT_OR_MIGRATE or
net.ipv4.tcp_migrate_req is enabled.
Currently, the expected_attach_type is not enforced for the
BPF_PROG_TYPE_SK_REUSEPORT type of program. Thus, this commit follows the
earlier idea in the commit
|
||
|
|
e624d4ed4a |
xdp: Extend xdp_redirect_map with broadcast support
This patch adds two flags BPF_F_BROADCAST and BPF_F_EXCLUDE_INGRESS to
extend xdp_redirect_map for broadcast support.
With BPF_F_BROADCAST the packet will be broadcasted to all the interfaces
in the map. with BPF_F_EXCLUDE_INGRESS the ingress interface will be
excluded when do broadcasting.
When getting the devices in dev hash map via dev_map_hash_get_next_key(),
there is a possibility that we fall back to the first key when a device
was removed. This will duplicate packets on some interfaces. So just walk
the whole buckets to avoid this issue. For dev array map, we also walk the
whole map to find valid interfaces.
Function bpf_clear_redirect_map() was removed in
commit
|
||
|
|
913d550376 |
bpf: Remove unused bpf_load_pointer
Remove unused bpf_load_pointer function in filter.h. The last user of it has
been removed with
|
||
|
|
e6ac2450d6 |
bpf: Support bpf program calling kernel function
This patch adds support to BPF verifier to allow bpf program calling kernel function directly. The use case included in this set is to allow bpf-tcp-cc to directly call some tcp-cc helper functions (e.g. "tcp_cong_avoid_ai()"). Those functions have already been used by some kernel tcp-cc implementations. This set will also allow the bpf-tcp-cc program to directly call the kernel tcp-cc implementation, For example, a bpf_dctcp may only want to implement its own dctcp_cwnd_event() and reuse other dctcp_*() directly from the kernel tcp_dctcp.c instead of reimplementing (or copy-and-pasting) them. The tcp-cc kernel functions mentioned above will be white listed for the struct_ops bpf-tcp-cc programs to use in a later patch. The white listed functions are not bounded to a fixed ABI contract. Those functions have already been used by the existing kernel tcp-cc. If any of them has changed, both in-tree and out-of-tree kernel tcp-cc implementations have to be changed. The same goes for the struct_ops bpf-tcp-cc programs which have to be adjusted accordingly. This patch is to make the required changes in the bpf verifier. First change is in btf.c, it adds a case in "btf_check_func_arg_match()". When the passed in "btf->kernel_btf == true", it means matching the verifier regs' states with a kernel function. This will handle the PTR_TO_BTF_ID reg. It also maps PTR_TO_SOCK_COMMON, PTR_TO_SOCKET, and PTR_TO_TCP_SOCK to its kernel's btf_id. In the later libbpf patch, the insn calling a kernel function will look like: insn->code == (BPF_JMP | BPF_CALL) insn->src_reg == BPF_PSEUDO_KFUNC_CALL /* <- new in this patch */ insn->imm == func_btf_id /* btf_id of the running kernel */ [ For the future calling function-in-kernel-module support, an array of module btf_fds can be passed at the load time and insn->off can be used to index into this array. ] At the early stage of verifier, the verifier will collect all kernel function calls into "struct bpf_kfunc_desc". Those descriptors are stored in "prog->aux->kfunc_tab" and will be available to the JIT. Since this "add" operation is similar to the current "add_subprog()" and looking for the same insn->code, they are done together in the new "add_subprog_and_kfunc()". In the "do_check()" stage, the new "check_kfunc_call()" is added to verify the kernel function call instruction: 1. Ensure the kernel function can be used by a particular BPF_PROG_TYPE. A new bpf_verifier_ops "check_kfunc_call" is added to do that. The bpf-tcp-cc struct_ops program will implement this function in a later patch. 2. Call "btf_check_kfunc_args_match()" to ensure the regs can be used as the args of a kernel function. 3. Mark the regs' type, subreg_def, and zext_dst. At the later do_misc_fixups() stage, the new fixup_kfunc_call() will replace the insn->imm with the function address (relative to __bpf_call_base). If needed, the jit can find the btf_func_model by calling the new bpf_jit_find_kfunc_model(prog, insn). With the imm set to the function address, "bpftool prog dump xlated" will be able to display the kernel function calls the same way as it displays other bpf helper calls. gpl_compatible program is required to call kernel function. This feature currently requires JIT. The verifier selftests are adjusted because of the changes in the verbose log in add_subprog_and_kfunc(). Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210325015142.1544736-1-kafai@fb.com |