Pull networking fixes from Paolo Abeni:
"Notably this includes fixes for a few regressions spotted very
recently. No known outstanding ones.
Current release - regressions:
- core: avoid CFI problems with sock priv helpers
- xsk: bring back busy polling support
- netpoll: ensure skb_pool list is always initialized
Current release - new code bugs:
- core: make page_pool_ref_netmem work with net iovs
- ipv4: route: fix drop reason being overridden in
ip_route_input_slow
- udp: make rehash4 independent in udp_lib_rehash()
Previous releases - regressions:
- bpf: fix bpf_sk_select_reuseport() memory leak
- openvswitch: fix lockup on tx to unregistering netdev with carrier
- mptcp: be sure to send ack when mptcp-level window re-opens
- eth:
- bnxt: always recalculate features after XDP clearing, fix
null-deref
- mlx5: fix sub-function add port error handling
- fec: handle page_pool_dev_alloc_pages error
Previous releases - always broken:
- vsock: some fixes due to transport de-assignment
- eth:
- ice: fix E825 initialization
- mlx5e: fix inversion dependency warning while enabling IPsec
tunnel
- gtp: destroy device along with udp socket's netns dismantle.
- xilinx: axienet: Fix IRQ coalescing packet count overflow"
* tag 'net-6.13-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (44 commits)
netdev: avoid CFI problems with sock priv helpers
net/mlx5e: Always start IPsec sequence number from 1
net/mlx5e: Rely on reqid in IPsec tunnel mode
net/mlx5e: Fix inversion dependency warning while enabling IPsec tunnel
net/mlx5: Clear port select structure when fail to create
net/mlx5: SF, Fix add port error handling
net/mlx5: Fix a lockdep warning as part of the write combining test
net/mlx5: Fix RDMA TX steering prio
net: make page_pool_ref_netmem work with net iovs
net: ethernet: xgbe: re-add aneg to supported features in PHY quirks
net: pcs: xpcs: actively unset DW_VR_MII_DIG_CTRL1_2G5_EN for 1G SGMII
net: pcs: xpcs: fix DW_VR_MII_DIG_CTRL1_2G5_EN bit being set for 1G SGMII w/o inband
selftests: net: Adapt ethtool mq tests to fix in qdisc graft
net: fec: handle page_pool_dev_alloc_pages error
net: netpoll: ensure skb_pool list is always initialized
net: xilinx: axienet: Fix IRQ coalescing packet count overflow
nfp: bpf: prevent integer overflow in nfp_bpf_event_output()
selftests: mptcp: avoid spurious errors on disconnect
mptcp: fix spurious wake-up on under memory pressure
mptcp: be sure to send ack when mptcp-level window re-opens
...
Because of patch[1] the graft behaviour changed
So the command:
tcq replace parent 100:1 handle 204:
Is no longer valid and will not delete 100:4 added by command:
tcq replace parent 100:4 handle 204: pfifo_fast
So to maintain the original behaviour, this patch manually deletes 100:4
and grafts 100:1
Note: This change will also work fine without [1]
[1] https://lore.kernel.org/netdev/20250111151455.75480-1-jhs@mojatatu.com/T/#u
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The disconnect test-case generates spurious errors:
INFO: disconnect
INFO: extra options: -I 3 -i /tmp/tmp.r43niviyoI
01 ns1 MPTCP -> ns1 (10.0.1.1:10000 ) MPTCP (duration 140ms) [FAIL]
file received by server does not match (in, out):
Unexpected revents: POLLERR/POLLNVAL(19)
-rw-r--r-- 1 root root 10028676 Jan 10 10:47 /tmp/tmp.r43niviyoI.disconnect
Trailing bytes are:
��\����R���!8��u2��5N%
-rw------- 1 root root 9992290 Jan 10 10:47 /tmp/tmp.Os4UbnWbI1
Trailing bytes are:
��\����R���!8��u2��5N%
02 ns1 MPTCP -> ns1 (dead:beef:1::1:10001) MPTCP (duration 206ms) [ OK ]
03 ns1 MPTCP -> ns1 (dead:beef:1::1:10002) TCP (duration 31ms) [ OK ]
04 ns1 TCP -> ns1 (dead:beef:1::1:10003) MPTCP (duration 26ms) [ OK ]
[FAIL] Tests of the full disconnection have failed
Time: 2 seconds
The root cause is actually in the user-space bits: the test program
currently disconnects as soon as all the pending data has been spooled,
generating an FASTCLOSE. If such option reaches the peer before the
latter has reached the closed status, the msk socket will report an
error to the user-space, as per protocol specification, causing the
above failure.
Address the issue explicitly waiting for all the relevant sockets to
reach a closed status before performing the disconnect.
Fixes: 05be5e273c ("selftests: mptcp: add disconnect tests")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250113-net-mptcp-connect-st-flakes-v1-3-0d986ee7b1b6@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pull misc fixes from Andrew Morton:
"18 hotfixes. 11 are cc:stable. 13 are MM and 5 are non-MM.
All patches are singletons - please see the relevant changelogs for
details"
* tag 'mm-hotfixes-stable-2025-01-13-00-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
fs/proc: fix softlockup in __read_vmcore (part 2)
mm: fix assertion in folio_end_read()
mm: vmscan : pgdemote vmstat is not getting updated when MGLRU is enabled.
vmstat: disable vmstat_work on vmstat_cpu_down_prep()
zram: fix potential UAF of zram table
selftests/mm: set allocated memory to non-zero content in cow test
mm: clear uffd-wp PTE/PMD state on mremap()
module: fix writing of livepatch relocations in ROX text
mm: zswap: properly synchronize freeing resources during CPU hotunplug
Revert "mm: zswap: fix race between [de]compression and CPU hotunplug"
hugetlb: fix NULL pointer dereference in trace_hugetlbfs_alloc_inode
mm: fix div by zero in bdi_ratio_from_pages
x86/execmem: fix ROX cache usage in Xen PV guests
filemap: avoid truncating 64-bit offset to 32 bits
tools: fix atomic_set() definition to set the value correctly
mm/mempolicy: count MPOL_WEIGHTED_INTERLEAVE to "interleave_hit"
scripts/decode_stacktrace.sh: fix decoding of lines with an additional info
mm/kmemleak: fix percpu memory leak detection failure
After commit b1f202060a ("mm: remap unused subpages to shared zeropage
when splitting isolated thp"), cow test cases involving swapping out THPs
via madvise(MADV_PAGEOUT) started to be skipped due to the subsequent
check via pagemap determining that the memory was not actually swapped
out. Logs similar to this were emitted:
...
# [RUN] Basic COW after fork() ... with swapped-out, PTE-mapped THP (16 kB)
ok 2 # SKIP MADV_PAGEOUT did not work, is swap enabled?
# [RUN] Basic COW after fork() ... with single PTE of swapped-out THP (16 kB)
ok 3 # SKIP MADV_PAGEOUT did not work, is swap enabled?
# [RUN] Basic COW after fork() ... with swapped-out, PTE-mapped THP (32 kB)
ok 4 # SKIP MADV_PAGEOUT did not work, is swap enabled?
...
The commit in question introduces the behaviour of scanning THPs and if
their content is predominantly zero, it splits them and replaces the pages
which are wholly zero with the zero page. These cow test cases were
getting caught up in this.
So let's avoid that by filling the contents of all allocated memory with
a non-zero value. With this in place, the tests are passing again.
Link: https://lkml.kernel.org/r/20250107142555.1870101-1-ryan.roberts@arm.com
Fixes: b1f202060a ("mm: remap unused subpages to shared zeropage when splitting isolated thp")
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Usama Arif <usamaarif642@gmail.com>
Cc: Yu Zhao <yuzhao@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Pull kvm fixes from Paolo Bonzini:
"The largest part here is for KVM/PPC, where a NULL pointer dereference
was introduced in the 6.13 merge window and is now fixed.
There's some "holiday-induced lateness", as the s390 submaintainer put
it, but otherwise things looks fine.
s390:
- fix a latent bug when the kernel is compiled in debug mode
- two small UCONTROL fixes and their selftests
arm64:
- always check page state in hyp_ack_unshare()
- align set_id_regs selftest with the fact that ASIDBITS field is RO
- various vPMU fixes for bugs that only affect nested virt
PPC e500:
- Fix a mostly impossible (but just wrong) case where IRQs were never
re-enabled
- Observe host permissions instead of mapping readonly host pages as
guest-writable. This fixes a NULL-pointer dereference in 6.13
- Replace brittle VMA-based attempts at building huge shadow TLB
entries with PTE lookups"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: e500: perform hugepage check after looking up the PFN
KVM: e500: map readonly host pages for read
KVM: e500: track host-writability of pages
KVM: e500: use shadow TLB entry as witness for writability
KVM: e500: always restore irqs
KVM: s390: selftests: Add has device attr check to uc_attr_mem_limit selftest
KVM: s390: selftests: Add ucontrol gis routing test
KVM: s390: Reject KVM_SET_GSI_ROUTING on ucontrol VMs
KVM: s390: selftests: Add ucontrol flic attr selftests
KVM: s390: Reject setting flic pfault attributes on ucontrol VMs
KVM: s390: vsie: fix virtual/physical address in unpin_scb()
KVM: arm64: Only apply PMCR_EL0.P to the guest range of counters
KVM: arm64: nv: Reload PMU events upon MDCR_EL2.HPME change
KVM: arm64: Use KVM_REQ_RELOAD_PMU to handle PMCR_EL0.E change
KVM: arm64: Add unified helper for reprogramming counters by mask
KVM: arm64: Always check the state from hyp_ack_unshare()
KVM: arm64: Fix set_id_regs selftest for ASIDBITS becoming unwritable
KVM/arm64 changes for 6.13, part #3
- Always check page state in hyp_ack_unshare()
- Align set_id_regs selftest with the fact that ASIDBITS field is RO
- Various vPMU fixes for bugs that only affect nested virt
Pull cgroup fixes from Tejun Heo:
"Cpuset fixes:
- Fix isolated CPUs leaking into sched domains
- Remove now unnecessary kernfs active break which can trigger a
warning
- Comment updates"
* tag 'cgroup-for-6.13-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup/cpuset: remove kernfs active break
cgroup/cpuset: Prevent leakage of isolated CPUs into sched domains
cgroup/cpuset: Remove stale text
Pull RISC-V fixes from Palmer Dabbelt:
- a handful of selftest fixes
- fix a memory leak in relocation processing during module loading
- avoid sleeping in die()
- fix kprobe instruction slot address calculations
- fix DT node reference leak in SBI idle probing
- avoid initializing out of bounds pages on sparse vmemmap systems with
a gap at the start of their physical memory map
- fix backtracing through exceptions
- _Q_PENDING_LOOPS is now defined whenever QUEUED_SPINLOCKS=y
- local labels in entry.S are now marked with ".L", which prevents them
from trashing backtraces
- a handful of fixes for SBI-based performance counters
* tag 'riscv-for-linus-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
drivers/perf: riscv: Do not allow invalid raw event config
drivers/perf: riscv: Return error for default case
drivers/perf: riscv: Fix Platform firmware event data
tools: selftests: riscv: Add test count for vstate_prctl
tools: selftests: riscv: Add pass message for v_initval_nolibc
riscv: use local label names instead of global ones in assembly
riscv: qspinlock: Fixup _Q_PENDING_LOOPS definition
riscv: stacktrace: fix backtracing through exceptions
riscv: mm: Fix the out of bound issue of vmemmap address
cpuidle: riscv-sbi: fix device node release in early exit of for_each_possible_cpu
riscv: kprobes: Fix incorrect address calculation
riscv: Fix sleeping in invalid context in die()
riscv: module: remove relocation_head rel_entry member allocation
riscv: selftests: Fix warnings pointer masking test
Pull networking fixes from Jakub Kicinski:
"Including fixes from netfilter, Bluetooth and WPAN.
No outstanding fixes / investigations at this time.
Current release - new code bugs:
- eth: fbnic: revert HWMON support, it doesn't work at all and revert
is similar size as the fixes
Previous releases - regressions:
- tcp: allow a connection when sk_max_ack_backlog is zero
- tls: fix tls_sw_sendmsg error handling
Previous releases - always broken:
- netdev netlink family:
- prevent accessing NAPI instances from another namespace
- don't dump Tx and uninitialized NAPIs
- net: sysctl: avoid using current->nsproxy, fix null-deref if task
is exiting and stick to opener's netns
- sched: sch_cake: add bounds checks to host bulk flow fairness
counts
Misc:
- annual cleanup of inactive maintainers"
* tag 'net-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (57 commits)
rds: sysctl: rds_tcp_{rcv,snd}buf: avoid using current->nsproxy
sctp: sysctl: plpmtud_probe_interval: avoid using current->nsproxy
sctp: sysctl: udp_port: avoid using current->nsproxy
sctp: sysctl: auth_enable: avoid using current->nsproxy
sctp: sysctl: rto_min/max: avoid using current->nsproxy
sctp: sysctl: cookie_hmac_alg: avoid using current->nsproxy
mptcp: sysctl: blackhole timeout: avoid using current->nsproxy
mptcp: sysctl: sched: avoid using current->nsproxy
mptcp: sysctl: avail sched: remove write access
MAINTAINERS: remove Lars Povlsen from Microchip Sparx5 SoC
MAINTAINERS: remove Noam Dagan from AMAZON ETHERNET
MAINTAINERS: remove Ying Xue from TIPC
MAINTAINERS: remove Mark Lee from MediaTek Ethernet
MAINTAINERS: mark stmmac ethernet as an Orphan
MAINTAINERS: remove Andy Gospodarek from bonding
MAINTAINERS: update maintainers for Microchip LAN78xx
MAINTAINERS: mark Synopsys DW XPCS as Orphan
net/mlx5: Fix variable not being completed when function returns
rtase: Fix a check for error in rtase_alloc_msix()
net: stmmac: dwmac-tegra: Read iommu stream id from device tree
...
This contains a pair of fixes for the vector self tests, which avoids
some warnings and provides proper status messages.
* b4-shazam-merge:
tools: selftests: riscv: Add test count for vstate_prctl
tools: selftests: riscv: Add pass message for v_initval_nolibc
Link: https://lore.kernel.org/r/20241220091730.28006-1-yongxuan.wang@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Pull hotfixes from Andrew Morton:
"25 hotfixes. 16 are cc:stable. 18 are MM and 7 are non-MM.
The usual bunch of singletons and two doubletons - please see the
relevant changelogs for details"
* tag 'mm-hotfixes-stable-2025-01-04-18-02' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (25 commits)
MAINTAINERS: change Arınç _NAL's name and email address
scripts/sorttable: fix orc_sort_cmp() to maintain symmetry and transitivity
mm/util: make memdup_user_nul() similar to memdup_user()
mm, madvise: fix potential workingset node list_lru leaks
mm/damon/core: fix ignored quota goals and filters of newly committed schemes
mm/damon/core: fix new damon_target objects leaks on damon_commit_targets()
mm/list_lru: fix false warning of negative counter
vmstat: disable vmstat_work on vmstat_cpu_down_prep()
mm: shmem: fix the update of 'shmem_falloc->nr_unswapped'
mm: shmem: fix incorrect index alignment for within_size policy
percpu: remove intermediate variable in PERCPU_PTR()
mm: zswap: fix race between [de]compression and CPU hotunplug
ocfs2: fix slab-use-after-free due to dangling pointer dqi_priv
fs/proc/task_mmu: fix pagemap flags with PMD THP entries on 32bit
kcov: mark in_softirq_really() as __always_inline
docs: mm: fix the incorrect 'FileHugeMapped' field
mailmap: modify the entry for Mathieu Othacehe
mm/kmemleak: fix sleeping function called from invalid context at print message
mm: hugetlb: independent PMD page table shared count
maple_tree: reload mas before the second call for mas_empty_area
...
Pull sched_ext fixes from Tejun Heo:
- Fix a bug where bpf_iter_scx_dsq_new() was not initializing the
iterator's flags and could inadvertently enable e.g. reverse
iteration
- Fix a bug where scx_ops_bypass() could call irq_restore twice
- Add Andrea and Changwoo as maintainers for better review coverage
- selftests and tools/sched_ext build and other fixes
* tag 'sched_ext-for-6.13-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
sched_ext: Fix dsq_local_on selftest
sched_ext: initialize kit->cursor.flags
sched_ext: Fix invalid irq restore in scx_ops_bypass()
MAINTAINERS: add me as reviewer for sched_ext
MAINTAINERS: add self as reviewer for sched_ext
scx: Fix maximal BPF selftest prog
sched_ext: fix application of sizeof to pointer
selftests/sched_ext: fix build after renames in sched_ext API
sched_ext: Add __weak to fix the build errors
Pull networking fixes from Jakub Kicinski:
"Including fixes from wireles and netfilter.
Nothing major here. Over the last two weeks we gathered only around
two-thirds of our normal weekly fix count, but delaying sending these
until -rc7 seemed like a really bad idea.
AFAIK we have no bugs under investigation. One or two reverts for
stuff for which we haven't gotten a proper fix will likely come in the
next PR.
Current release - fix to a fix:
- netfilter: nft_set_hash: unaligned atomic read on struct
nft_set_ext
- eth: gve: trigger RX NAPI instead of TX NAPI in gve_xsk_wakeup
Previous releases - regressions:
- net: reenable NETIF_F_IPV6_CSUM offload for BIG TCP packets
- mptcp:
- fix sleeping rcvmsg sleeping forever after bad recvbuffer adjust
- fix TCP options overflow
- prevent excessive coalescing on receive, fix throughput
- net: fix memory leak in tcp_conn_request() if map insertion fails
- wifi: cw1200: fix potential NULL dereference after conversion to
GPIO descriptors
- phy: micrel: dynamically control external clock of KSZ PHY, fix
suspend behavior
Previous releases - always broken:
- af_packet: fix VLAN handling with MSG_PEEK
- net: restrict SO_REUSEPORT to inet sockets
- netdev-genl: avoid empty messages in NAPI get
- dsa: microchip: fix set_ageing_time function on KSZ9477 and LAN937X
- eth:
- gve: XDP fixes around transmit, queue wakeup etc.
- ti: icssg-prueth: fix firmware load sequence to prevent time
jump which breaks timesync related operations
Misc:
- netlink: specs: mptcp: add missing attr and improve documentation"
* tag 'net-6.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (50 commits)
net: ti: icssg-prueth: Fix clearing of IEP_CMP_CFG registers during iep_init
net: ti: icssg-prueth: Fix firmware load sequence.
mptcp: prevent excessive coalescing on receive
mptcp: don't always assume copied data in mptcp_cleanup_rbuf()
mptcp: fix recvbuffer adjust on sleeping rcvmsg
ila: serialize calls to nf_register_net_hooks()
af_packet: fix vlan_get_protocol_dgram() vs MSG_PEEK
af_packet: fix vlan_get_tci() vs MSG_PEEK
net: wwan: iosm: Properly check for valid exec stage in ipc_mmio_init()
net: restrict SO_REUSEPORT to inet sockets
net: reenable NETIF_F_IPV6_CSUM offload for BIG TCP packets
net: sfc: Correct key_len for efx_tc_ct_zone_ht_params
net: wwan: t7xx: Fix FSM command timeout issue
sky2: Add device ID 11ab:4373 for Marvell 88E8075
mptcp: fix TCP options overflow.
net: mv643xx_eth: fix an OF node reference leak
gve: trigger RX NAPI instead of TX NAPI in gve_xsk_wakeup
eth: bcmsysport: fix call balance of priv->clk handling routines
net: llc: reset skb->transport_header
netlink: specs: mptcp: fix missing doc
...
Pull sound fixes from Takashi Iwai:
"A collection of small fixes. Nothing really stands out, fortunately.
- Follow-up fixes for the new compress offload API extension
- A few ASoC SOF, AMD and Mediatek quirks and fixes
- A regression fix in legacy SH driver cleanup
- Fix DMA mapping error handling in the helper code
- Fix kselftest dependency"
* tag 'sound-6.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: sh: Fix wrong argument order for copy_from_iter()
selftests/alsa: Fix circular dependency involving global-timer
ALSA: memalloc: prefer dma_mapping_error() over explicit address checking
ALSA: compress_offload: improve file descriptors installation for dma-buf
ALSA: compress_offload: use safe list iteration in snd_compr_task_seq()
ALSA: compress_offload: avoid 64-bit get_user()
ALSA: compress_offload: import DMA_BUF namespace
ASoC: mediatek: disable buffer pre-allocation
ASoC: rt722: add delay time to wait for the calibration procedure
ASoC: SOF: Intel: hda-dai: Do not release the link DMA on STOP
ASoC: dt-bindings: realtek,rt5645: Fix CPVDD voltage comment
ASoC: Intel: sof_sdw: Fix DMI match for Lenovo 21QA and 21QB
ASoC: Intel: sof_sdw: Fix DMI match for Lenovo 21Q6 and 21Q7
ASoC: amd: ps: Fix for enabling DMIC on acp63 platform via _DSD entry
The dsp_local_on selftest expects the scheduler to fail by trying to
schedule an e.g. CPU-affine task to the wrong CPU. However, this isn't
guaranteed to happen in the 1 second window that the test is running.
Besides, it's odd to have this particular exception path tested when there
are no other tests that verify that the interface is working at all - e.g.
the test would pass if dsp_local_on interface is completely broken and fails
on any attempt.
Flip the test so that it verifies that the feature works. While at it, fix a
typo in the info message.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Ihor Solodrai <ihor.solodrai@pm.me>
Link: http://lkml.kernel.org/r/Z1n9v7Z6iNJ-wKmq@slm.duckdns.org
Signed-off-by: Tejun Heo <tj@kernel.org>