[ Upstream commit 9492261ff2460252cf2d8de89cdf854c7e2b28a0 ]
When we started spreading new inode numbers throughout most of the 64
bit inode space, that triggered some corner case bugs, in particular
some integer overflows related to the radix tree code. Oops.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 24de14c98b37ea40a7e493dfd0d93b400b6efbca ]
If the outer layer for loop is iterated more than once and it fails not
in the first iteration, the filtered_suite and filtered_suite->test_cases
allocated in the last kunit_filter_attr_tests() in last inner for loop
is leaked.
So add a new free_filtered_suite err label and free the filtered_suite
and filtered_suite->test_cases so far. And change kmalloc_array of copy
to kcalloc to Clear the copy to make the kfree safe.
Fixes: 529534e8cb ("kunit: Add ability to filter attributes")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e44679515a7b803cf0143dc9de3d2ecbe907f939 ]
If the outer layer for loop is iterated more than once and it fails not
in the first iteration, the copy pointer has been moved. So it should free
the original copy's backup copy_start.
Fixes: abbf73816b ("kunit: fix possible memory leak in kunit_filter_suites()")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Users complained about OOM errors during fork without triggering
compaction. This can be fixed by modifying the flags used in
mas_expected_entries() so that the compaction will be triggered in low
memory situations. Since mas_expected_entries() is only used during fork,
the extra argument does not need to be passed through.
Additionally, the two test_maple_tree test cases and one benchmark test
were altered to use the correct locking type so that allocations would not
trigger sleeping and thus fail. Testing was completed with lockdep atomic
sleep detection.
The additional locking change requires rwsem support additions to the
tools/ directory through the use of pthreads pthread_rwlock_t. With this
change test_maple_tree works in userspace, as a module, and in-kernel.
Users may notice that the system gave up early on attempting to start new
processes instead of attempting to reclaim memory.
Link: https://lkml.kernel.org/r/20230915093243epcms1p46fa00bbac1ab7b7dca94acb66c44c456@epcms1p4
Link: https://lkml.kernel.org/r/20231012155233.2272446-1-Liam.Howlett@oracle.com
Fixes: 54a611b605 ("Maple Tree: add new data structure")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Peng Zhang <zhangpeng.00@bytedance.com>
Cc: <jason.sim@samsung.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
When updating the maple tree iterator to avoid rewalks, an issue was
introduced when shifting beyond the limits. This can be seen by trying to
go to the previous address of 0, which would set the maple node to
MAS_NONE and keep the range as the last entry.
Subsequent calls to mas_find() would then search upwards from mas->last
and skip the value at mas->index/mas->last. This showed up as a bug in
mprotect which skips the actual VMA at the current range after attempting
to go to the previous VMA from 0.
Since MAS_NONE may already be set when searching for a value that isn't
contained within a node, changing the handling of MAS_NONE in mas_find()
would make the code more complicated and error prone. Furthermore, there
was no way to tell which limit was hit, and thus which action to take
(next or the entry at the current range).
This solution is to add two states to track what happened with the
previous iterator action. This allows for the expected behaviour of the
next command to return the correct item (either the item at the range
requested, or the next/previous).
Tests are also added and updated accordingly.
Link: https://lkml.kernel.org/r/20230921181236.509072-3-Liam.Howlett@oracle.com
Link: https://gist.github.com/heatd/85d2971fae1501b55b6ea401fbbe485b
Link: https://lore.kernel.org/linux-mm/20230921181236.509072-1-Liam.Howlett@oracle.com/
Fixes: 39193685d5 ("maple_tree: try harder to keep active node with mas_prev()")
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reported-by: Pedro Falcato <pedro.falcato@gmail.com>
Closes: https://gist.github.com/heatd/85d2971fae1501b55b6ea401fbbe485b
Closes: https://bugs.archlinux.org/task/79656
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Describe missing function parameters to prevent kernel-doc warnings:
lib/scatterlist.c:288: warning: Function parameter or member 'first_chunk' not described in '__sg_alloc_table'
lib/scatterlist.c:800: warning: Function parameter or member 'flags' not described in 'sg_miter_start'
Link: https://lkml.kernel.org/r/20230912060848.4673-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Pull kunit fixes from Shuah Khan:
"Fixes to possible memory leak, null-ptr-deref, wild-memory-access, and
error path bugs"
* tag 'linux-kselftest-kunit-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
kunit: Fix possible memory leak in kunit_filter_suites()
kunit: Fix possible null-ptr-deref in kunit_parse_glob_filter()
kunit: Fix the wrong err path and add goto labels in kunit_filter_suites()
kunit: Fix wild-memory-access bug in kunit_free_suite_set()
kunit: test: Make filter strings in executor_test writable
iov_iter_extract_pages() doesn't correctly handle skipping over initial
zero-length entries in ITER_KVEC and ITER_BVEC-type iterators.
The problem is that it accidentally reduces maxsize to 0 when it
skipping and thus runs to the end of the array and returns 0.
Fix this by sticking the calculated size-to-copy in a new variable
rather than back in maxsize.
Fixes: 7d58fe7310 ("iov_iter: Add a function to extract a page list from an iterator")
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: David Hildenbrand <david@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull xarray fixes from Matthew Wilcox:
- Fix a bug encountered by people using bittorrent where they'd get
NULL pointer dereferences on page cache lookups when using XFS
- Two documentation fixes
* tag 'xarray-6.6' of git://git.infradead.org/users/willy/xarray:
idr: fix param name in idr_alloc_cyclic() doc
xarray: Document necessary flag in alloc functions
XArray: Do not return sibling entries from xa_load()
Pull LoongArch updates from Huacai Chen:
- Allow usage of LSX/LASX in the kernel, and use them for
SIMD-optimized RAID5/RAID6 routines
- Add Loongson Binary Translation (LBT) extension support
- Add basic KGDB & KDB support
- Add building with kcov coverage
- Add KFENCE (Kernel Electric-Fence) support
- Add KASAN (Kernel Address Sanitizer) support
- Some bug fixes and other small changes
- Update the default config file
* tag 'loongarch-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: (25 commits)
LoongArch: Update Loongson-3 default config file
LoongArch: Add KASAN (Kernel Address Sanitizer) support
LoongArch: Simplify the processing of jumping new kernel for KASLR
kasan: Add (pmd|pud)_init for LoongArch zero_(pud|p4d)_populate process
kasan: Add __HAVE_ARCH_SHADOW_MAP to support arch specific mapping
LoongArch: Add KFENCE (Kernel Electric-Fence) support
LoongArch: Get partial stack information when providing regs parameter
LoongArch: mm: Add page table mapped mode support for virt_to_page()
kfence: Defer the assignment of the local variable addr
LoongArch: Allow building with kcov coverage
LoongArch: Provide kaslr_offset() to get kernel offset
LoongArch: Add basic KGDB & KDB support
LoongArch: Add Loongson Binary Translation (LBT) extension support
raid6: Add LoongArch SIMD recovery implementation
raid6: Add LoongArch SIMD syndrome calculation
LoongArch: Add SIMD-optimized XOR routines
LoongArch: Allow usage of LSX/LASX in the kernel
LoongArch: Define symbol 'fault' as a local label in fpu.S
LoongArch: Adjust {copy, clear}_user exception handler behavior
LoongArch: Use static defined zero page rather than allocated
...
Similar to the syndrome calculation, the recovery algorithms also work
on 64 bytes at a time to align with the L1 cache line size of current
and future LoongArch cores (that we care about). Which means
unrolled-by-4 LSX and unrolled-by-2 LASX code.
The assembly is originally based on the x86 SSSE3/AVX2 ports, but
register allocation has been redone to take advantage of LSX/LASX's 32
vector registers, and instruction sequence has been optimized to suit
(e.g. LoongArch can perform per-byte srl and andi on vectors, but x86
cannot).
Performance numbers measured by instrumenting the raid6test code, on a
3A5000 system clocked at 2.5GHz:
> lasx 2data: 354.987 MiB/s
> lasx datap: 350.430 MiB/s
> lsx 2data: 340.026 MiB/s
> lsx datap: 337.318 MiB/s
> intx1 2data: 164.280 MiB/s
> intx1 datap: 187.966 MiB/s
Because recovery algorithms are chosen solely based on priority and
availability, lasx is marked as priority 2 and lsx priority 1. At least
for the current generation of LoongArch micro-architectures, LASX should
always be faster than LSX whenever supported, and have similar power
consumption characteristics (because the only known LASX-capable uarch,
the LA464, always compute the full 256-bit result for vector ops).
Acked-by: Song Liu <song@kernel.org>
Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
The algorithms work on 64 bytes at a time, which is the L1 cache line
size of all current and future LoongArch cores (that we care about), as
confirmed by Huacai. The code is based on the generic int.uc algorithm,
unrolled 4 times for LSX and 2 times for LASX. Further unrolling does
not meaningfully improve the performance according to experiments.
Performance numbers measured during system boot on a 3A5000 @ 2.5GHz:
> raid6: lasx gen() 12726 MB/s
> raid6: lsx gen() 10001 MB/s
> raid6: int64x8 gen() 2876 MB/s
> raid6: int64x4 gen() 3867 MB/s
> raid6: int64x2 gen() 2531 MB/s
> raid6: int64x1 gen() 1945 MB/s
Comparison of xor() speeds (from different boots but meaningful anyway):
> lasx: 11226 MB/s
> lsx: 6395 MB/s
> int64x4: 2147 MB/s
Performance as measured by raid6test:
> raid6: lasx gen() 25109 MB/s
> raid6: lsx gen() 13233 MB/s
> raid6: int64x8 gen() 4164 MB/s
> raid6: int64x4 gen() 6005 MB/s
> raid6: int64x2 gen() 5781 MB/s
> raid6: int64x1 gen() 4119 MB/s
> raid6: using algorithm lasx gen() 25109 MB/s
> raid6: .... xor() 14439 MB/s, rmw enabled
Acked-by: Song Liu <song@kernel.org>
Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
The relevant parameter is 'start' and not 'nextid'
Fixes: 460488c58c ("idr: Remove idr_alloc_ext")
Signed-off-by: Ariel Marcovitch <arielmarcovitch@gmail.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Adds a new line to the docstrings of functions wrapping __xa_alloc() and
__xa_alloc_cyclic(), informing about the necessity of flag XA_FLAGS_ALLOC
being set previously.
The documentation so far says that functions wrapping __xa_alloc() and
__xa_alloc_cyclic() are supposed to return either -ENOMEM or -EBUSY in
case of an error. If the xarray has been initialized without the flag
XA_FLAGS_ALLOC, however, they fail with a different, undocumented error
code.
As hinted at in Documentation/core-api/xarray.rst, wrappers around these
functions should only be invoked when the flag has been set. The
functions' documentation should reflect that as well.
Signed-off-by: Philipp Stanner <pstanner@redhat.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
If both filter_glob and filters are not NULL, and kunit_parse_glob_filter()
succeed, but kcalloc parsed_filters fails, the suite_glob and test_glob of
parsed kzalloc in kunit_parse_glob_filter() will be leaked.
As Rae suggested, assign -ENOMEM to *err to correctly free copy and goto
free_parsed_glob to free the suite/test_glob of parsed.
Fixes: 1c9fd080df ("kunit: fix uninitialized variables bug in attributes filtering")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Suggested-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Take the last kfree(parsed_filters) and add it to be the first. Take
the first kfree(copy) and add it to be the last. The Best practice is to
return these errors reversely.
And as David suggested, add several labels which target only the things
which actually have been allocated so far.
Fixes: 529534e8cb ("kunit: Add ability to filter attributes")
Fixes: abbf73816b ("kunit: fix possible memory leak in kunit_filter_suites()")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reviewed-by: Rae Moar <rmoar@google.com>
Suggested-by: David Gow <davidgow@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Pull printk updates from Petr Mladek:
- Do not try to get the console lock when it is not need or useful in
panic()
- Replace the global console_suspended state by a per-console flag
- Export symbols needed for dumping the raw printk buffer in panic()
- Fix documentation of printf formats for integer types
- Moved Sergey Senozhatsky to the reviewer role
- Misc cleanups
* tag 'printk-for-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
printk: export symbols for debug modules
lib: test_scanf: Add explicit type cast to result initialization in test_number_prefix()
printk: ringbuffer: Fix truncating buffer size min_t cast
printk: Rename abandon_console_lock_in_panic() to other_cpu_in_panic()
printk: Add per-console suspended state
printk: Consolidate console deferred printing
printk: Do not take console lock for console_flush_on_panic()
printk: Keep non-panic-CPUs out of console lock
printk: Reduce console_unblank() usage in unsafe scenarios
kdb: Do not assume write() callback available
docs: printk-formats: Treat char as always unsigned
docs: printk-formats: Fix hex printing of signed values
MAINTAINERS: adjust printk/vsprintf entries
Pull percpu updates from Dennis Zhou:
"One bigger change to percpu_counter's api allowing for init and
destroy of multiple counters via percpu_counter_init_many() and
percpu_counter_destroy_many(). This is used to help begin remediating
a performance regression with percpu rss stats.
Additionally, it seems larger core count machines are feeling the
burden of the single threaded allocation of percpu. Mateusz is
thinking about it and I will spend some time on it too.
percpu:
- A couple cleanups by Baoquan He and Bibo Mao. The only behavior
change is to start printing messages if we're under the warn limit
for failed atomic allocations.
percpu_counter:
- Shakeel introduced percpu counters into mm_struct which caused
percpu allocations be on the hot path [1]. Originally I spent some
time trying to improve the percpu allocator, but instead preferred
what Mateusz Guzik proposed grouping at the allocation site,
percpu_counter_init_many(). This allows a single percpu allocation
to be shared by the counters. I like this approach because it
creates a shared lifetime by the allocations. Additionally, I
believe many inits have higher level synchronization requirements,
like percpu_counter does against HOTPLUG_CPU. Therefore we can
group these optimizations together"
Link: https://lore.kernel.org/linux-mm/20221024052841.3291983-1-shakeelb@google.com/ [1]
* tag 'percpu-for-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu:
kernel/fork: group allocation/free of per-cpu counters for mm struct
pcpcntr: add group allocation/free
mm/percpu.c: print error message too if atomic alloc failed
mm/percpu.c: optimize the code in pcpu_setup_first_chunk() a little bit
mm/percpu.c: remove redundant check
mm/percpu: Remove some local variables in pcpu_populate_pte