mirror of
https://github.com/ukui/kernel.git
synced 2026-03-09 10:07:04 -07:00
72f039db491e59396edbaa39595d0865aee055ee
329 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
d8cc323d95 |
mm/vmalloc: fix a typo in comment
There is a typo in comment, fix it. "exeeds" -> "exceeds" Signed-off-by: Qiujun Huang <hqjagain@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20200404060136.10838-1-hqjagain@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
763802b53a |
x86/mm: split vmalloc_sync_all()
Commit |
||
|
|
a786810cc8 |
Merge tag 'v5.5-rc7' into efi/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
8e57f8acbb |
mm, debug_pagealloc: don't rely on static keys too early
Commit |
||
|
|
57ad87ddce |
Merge branch 'x86/mm' into efi/core, to pick up dependencies
Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
|
253a496d8e |
kasan: don't assume percpu shadow allocations will succeed
syzkaller and the fault injector showed that I was wrong to assume that
we could ignore percpu shadow allocation failures.
Handle failures properly. Merge all the allocated areas back into the
free list and release the shadow, then clean up and return NULL. The
shadow is released unconditionally, which relies upon the fact that the
release function is able to tolerate pages not being present.
Also clean up shadows in the recovery path - currently they are not
released, which leaks a bit of memory.
Link: http://lkml.kernel.org/r/20191205140407.1874-3-dja@axtens.net
Fixes:
|
||
|
|
d98c9e83b5 |
kasan: fix crashes on access to memory mapped by vm_map_ram()
With CONFIG_KASAN_VMALLOC=y any use of memory obtained via vm_map_ram()
will crash because there is no shadow backing that memory.
Instead of sprinkling additional kasan_populate_vmalloc() calls all over
the vmalloc code, move it into alloc_vmap_area(). This will fix
vm_map_ram() and simplify the code a bit.
[aryabinin@virtuozzo.com: v2]
Link: http://lkml.kernel.org/r/20191205095942.1761-1-aryabinin@virtuozzo.comLink: http://lkml.kernel.org/r/20191204204534.32202-1-aryabinin@virtuozzo.com
Fixes:
|
||
|
|
186525bd6b |
mm, x86/mm: Untangle address space layout definitions from basic pgtable type definitions
- Untangle the somewhat incestous way of how VMALLOC_START is used all across the
kernel, but is, on x86, defined deep inside one of the lowest level page table headers.
It doesn't help that vmalloc.h only includes a single asm header:
#include <asm/page.h> /* pgprot_t */
So there was no existing cross-arch way to decouple address layout
definitions from page.h details. I used this:
#ifndef VMALLOC_START
# include <asm/vmalloc.h>
#endif
This way every architecture that wants to simplify page.h can do so.
- Also on x86 we had a couple of LDT related inline functions that used
the late-stage address space layout positions - but these could be
uninlined without real trouble - the end result is cleaner this way as
well.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
||
|
|
3c5c3cfb9e |
kasan: support backing vmalloc space with real shadow memory
Patch series "kasan: support backing vmalloc space with real shadow
memory", v11.
Currently, vmalloc space is backed by the early shadow page. This means
that kasan is incompatible with VMAP_STACK.
This series provides a mechanism to back vmalloc space with real,
dynamically allocated memory. I have only wired up x86, because that's
the only currently supported arch I can work with easily, but it's very
easy to wire up other architectures, and it appears that there is some
work-in-progress code to do this on arm64 and s390.
This has been discussed before in the context of VMAP_STACK:
- https://bugzilla.kernel.org/show_bug.cgi?id=202009
- https://lkml.org/lkml/2018/7/22/198
- https://lkml.org/lkml/2019/7/19/822
In terms of implementation details:
Most mappings in vmalloc space are small, requiring less than a full
page of shadow space. Allocating a full shadow page per mapping would
therefore be wasteful. Furthermore, to ensure that different mappings
use different shadow pages, mappings would have to be aligned to
KASAN_SHADOW_SCALE_SIZE * PAGE_SIZE.
Instead, share backing space across multiple mappings. Allocate a
backing page when a mapping in vmalloc space uses a particular page of
the shadow region. This page can be shared by other vmalloc mappings
later on.
We hook in to the vmap infrastructure to lazily clean up unused shadow
memory.
Testing with test_vmalloc.sh on an x86 VM with 2 vCPUs shows that:
- Turning on KASAN, inline instrumentation, without vmalloc, introuduces
a 4.1x-4.2x slowdown in vmalloc operations.
- Turning this on introduces the following slowdowns over KASAN:
* ~1.76x slower single-threaded (test_vmalloc.sh performance)
* ~2.18x slower when both cpus are performing operations
simultaneously (test_vmalloc.sh sequential_test_order=1)
This is unfortunate but given that this is a debug feature only, not the
end of the world. The benchmarks are also a stress-test for the vmalloc
subsystem: they're not indicative of an overall 2x slowdown!
This patch (of 4):
Hook into vmalloc and vmap, and dynamically allocate real shadow memory
to back the mappings.
Most mappings in vmalloc space are small, requiring less than a full
page of shadow space. Allocating a full shadow page per mapping would
therefore be wasteful. Furthermore, to ensure that different mappings
use different shadow pages, mappings would have to be aligned to
KASAN_SHADOW_SCALE_SIZE * PAGE_SIZE.
Instead, share backing space across multiple mappings. Allocate a
backing page when a mapping in vmalloc space uses a particular page of
the shadow region. This page can be shared by other vmalloc mappings
later on.
We hook in to the vmap infrastructure to lazily clean up unused shadow
memory.
To avoid the difficulties around swapping mappings around, this code
expects that the part of the shadow region that covers the vmalloc space
will not be covered by the early shadow page, but will be left unmapped.
This will require changes in arch-specific code.
This allows KASAN with VMAP_STACK, and may be helpful for architectures
that do not have a separate module space (e.g. powerpc64, which I am
currently working on). It also allows relaxing the module alignment
back to PAGE_SIZE.
Testing with test_vmalloc.sh on an x86 VM with 2 vCPUs shows that:
- Turning on KASAN, inline instrumentation, without vmalloc, introuduces
a 4.1x-4.2x slowdown in vmalloc operations.
- Turning this on introduces the following slowdowns over KASAN:
* ~1.76x slower single-threaded (test_vmalloc.sh performance)
* ~2.18x slower when both cpus are performing operations
simultaneously (test_vmalloc.sh sequential_test_order=3D1)
This is unfortunate but given that this is a debug feature only, not the
end of the world.
The full benchmark results are:
Performance
No KASAN KASAN original x baseline KASAN vmalloc x baseline x KASAN
fix_size_alloc_test 662004 11404956 17.23 19144610 28.92 1.68
full_fit_alloc_test 710950 12029752 16.92 13184651 18.55 1.10
long_busy_list_alloc_test 9431875 43990172 4.66 82970178 8.80 1.89
random_size_alloc_test 5033626 23061762 4.58 47158834 9.37 2.04
fix_align_alloc_test 1252514 15276910 12.20 31266116 24.96 2.05
random_size_align_alloc_te 1648501 14578321 8.84 25560052 15.51 1.75
align_shift_alloc_test 147 830 5.65 5692 38.72 6.86
pcpu_alloc_test 80732 125520 1.55 140864 1.74 1.12
Total Cycles 119240774314 763211341128 6.40 1390338696894 11.66 1.82
Sequential, 2 cpus
No KASAN KASAN original x baseline KASAN vmalloc x baseline x KASAN
fix_size_alloc_test
|
||
|
|
e36176be1c |
mm/vmalloc: rework vmap_area_lock
With the new allocation approach introduced in the 5.2 kernel, it
becomes possible to get rid of one global spinlock. By doing that we
can further improve the KVA from the performance point of view.
Basically we can have two independent locks, one for allocation part and
another one for deallocation, because of two different entities: "free
data structures" and "busy data structures".
As a result, allocation/deallocation operations can still interfere
between each other in case of running simultaneously on different CPUs,
it means there is still dependency, but with two locks it becomes lower.
Summarizing:
- it reduces the high lock contention
- it allows to perform operations on "free" and "busy"
trees in parallel on different CPUs. Please note it
does not solve scalability issue.
Test results:
In order to evaluate this patch, we can run "vmalloc test driver" to see
how many CPU cycles it takes to complete all test cases running
sequentially. All online CPUs run it so it will cause a high lock
contention.
HiKey 960, ARM64, 8xCPUs, big.LITTLE:
<snip>
sudo ./test_vmalloc.sh sequential_test_order=1
<snip>
<default>
[ 390.950557] All test took CPU0=457126382 cycles
[ 391.046690] All test took CPU1=454763452 cycles
[ 391.128586] All test took CPU2=454539334 cycles
[ 391.222669] All test took CPU3=455649517 cycles
[ 391.313946] All test took CPU4=388272196 cycles
[ 391.410425] All test took CPU5=384036264 cycles
[ 391.492219] All test took CPU6=387432964 cycles
[ 391.578433] All test took CPU7=387201996 cycles
<default>
<patched>
[ 304.721224] All test took CPU0=391521310 cycles
[ 304.821219] All test took CPU1=393533002 cycles
[ 304.917120] All test took CPU2=392243032 cycles
[ 305.008986] All test took CPU3=392353853 cycles
[ 305.108944] All test took CPU4=297630721 cycles
[ 305.196406] All test took CPU5=297548736 cycles
[ 305.288602] All test took CPU6=297092392 cycles
[ 305.381088] All test took CPU7=297293597 cycles
<patched>
~14%-23% patched variant is better.
Link: http://lkml.kernel.org/r/20191022155800.20468-1-urezki@gmail.com
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
|
|
060650a2a0 |
mm/vmalloc: add more comments to the adjust_va_to_fit_type()
When fit type is NE_FIT_TYPE there is a need in one extra object. Usually the "ne_fit_preload_node" per-CPU variable has it and there is no need in GFP_NOWAIT allocation, but there are exceptions. This commit just adds more explanations, as a result giving answers on questions like when it can occur, how often, under which conditions and what happens if GFP_NOWAIT gets failed. Link: http://lkml.kernel.org/r/20191016095438.12391-3-urezki@gmail.com Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Daniel Wagner <dwagner@suse.de> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Uladzislau Rezki <urezki@gmail.com> Cc: Hillf Danton <hdanton@sina.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
f07116d77b |
mm/vmalloc: respect passed gfp_mask when doing preloading
Allocation functions should comply with the given gfp_mask as much as possible. The preallocation code in alloc_vmap_area doesn't follow that pattern and it is using a hardcoded GFP_KERNEL. Although this doesn't really make much difference because vmalloc is not GFP_NOWAIT compliant in general (e.g. page table allocations are GFP_KERNEL) there is no reason to spread that bad habit and it is good to fix the antipattern. [mhocko@suse.com: rewrite changelog] Link: http://lkml.kernel.org/r/20191016095438.12391-2-urezki@gmail.com Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Daniel Wagner <dwagner@suse.de> Cc: Hillf Danton <hdanton@sina.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
81f1ba586e |
mm/vmalloc: remove preempt_disable/enable when doing preloading
Some background. The preemption was disabled before to guarantee that a
preloaded object is available for a CPU, it was stored for. That was
achieved by combining the disabling the preemption and taking the spin
lock while the ne_fit_preload_node is checked.
The aim was to not allocate in atomic context when spinlock is taken
later, for regular vmap allocations. But that approach conflicts with
CONFIG_PREEMPT_RT philosophy. It means that calling spin_lock() with
disabled preemption is forbidden in the CONFIG_PREEMPT_RT kernel.
Therefore, get rid of preempt_disable() and preempt_enable() when the
preload is done for splitting purpose. As a result we do not guarantee
now that a CPU is preloaded, instead we minimize the case when it is
not, with this change, by populating the per cpu preload pointer under
the vmap_area_lock.
This implies that at least each caller that has done the preallocation
will not fallback to an atomic allocation later. It is possible that
the preallocation would be pointless or that no preallocation is done
because of the race but the data shows that this is really rare.
For example i run the special test case that follows the preload pattern
and path. 20 "unbind" threads run it and each does 1000000 allocations.
Only 3.5 times among 1000000 a CPU was not preloaded. So it can happen
but the number is negligible.
[mhocko@suse.com: changelog additions]
Link: http://lkml.kernel.org/r/20191016095438.12391-1-urezki@gmail.com
Fixes:
|
||
|
|
dcf61ff06d |
mm/vmalloc.c: remove unnecessary highmem_mask from parameter of gfpflags_allow_blocking()
gfpflags_allow_blocking() does not care about __GFP_HIGHMEM, so highmem_mask can be removed. Link: http://lkml.kernel.org/r/1568812319-3467-1-git-send-email-liuxiang_1999@126.com Signed-off-by: Liu Xiang <liuxiang_1999@126.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
fc9702273e |
bpf: Add mmap() support for BPF_MAP_TYPE_ARRAY
Add ability to memory-map contents of BPF array map. This is extremely useful
for working with BPF global data from userspace programs. It allows to avoid
typical bpf_map_{lookup,update}_elem operations, improving both performance
and usability.
There had to be special considerations for map freezing, to avoid having
writable memory view into a frozen map. To solve this issue, map freezing and
mmap-ing is happening under mutex now:
- if map is already frozen, no writable mapping is allowed;
- if map has writable memory mappings active (accounted in map->writecnt),
map freezing will keep failing with -EBUSY;
- once number of writable memory mappings drops to zero, map freezing can be
performed again.
Only non-per-CPU plain arrays are supported right now. Maps with spinlocks
can't be memory mapped either.
For BPF_F_MMAPABLE array, memory allocation has to be done through vmalloc()
to be mmap()'able. We also need to make sure that array data memory is
page-sized and page-aligned, so we over-allocate memory in such a way that
struct bpf_array is at the end of a single page of memory with array->value
being aligned with the start of the second page. On deallocation we need to
accomodate this memory arrangement to free vmalloc()'ed memory correctly.
One important consideration regarding how memory-mapping subsystem functions.
Memory-mapping subsystem provides few optional callbacks, among them open()
and close(). close() is called for each memory region that is unmapped, so
that users can decrease their reference counters and free up resources, if
necessary. open() is *almost* symmetrical: it's called for each memory region
that is being mapped, **except** the very first one. So bpf_map_mmap does
initial refcnt bump, while open() will do any extra ones after that. Thus
number of close() calls is equal to number of open() calls plus one more.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Link: https://lore.kernel.org/bpf/20191117172806.2195367-4-andriin@fb.com
|
||
|
|
315cc066b8 |
augmented rbtree: add new RB_DECLARE_CALLBACKS_MAX macro
Add RB_DECLARE_CALLBACKS_MAX, which generates augmented rbtree callbacks for the case where the augmented value is a scalar whose definition follows a max(f(node)) pattern. This actually covers all present uses of RB_DECLARE_CALLBACKS, and saves some (source) code duplication in the various RBCOMPUTE function definitions. [walken@google.com: fix mm/vmalloc.c] Link: http://lkml.kernel.org/r/CANN689FXgK13wDYNh1zKxdipeTuALG4eKvKpsdZqKFJ-rvtGiQ@mail.gmail.com [walken@google.com: re-add check to check_augmented()] Link: http://lkml.kernel.org/r/20190727022027.GA86863@google.com Link: http://lkml.kernel.org/r/20190703040156.56953-3-walken@google.com Signed-off-by: Michel Lespinasse <walken@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dbueso@suse.de> Cc: Uladzislau Rezki <urezki@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
7ea362427c |
mm/vmalloc.c: move 'area->pages' after if statement
If !area->pages statement is true where memory allocation fails, area is freed. In this case 'area->pages = pages' should not executed. So move 'area->pages = pages' after if statement. [akpm@linux-foundation.org: give area->pages the same treatment] Link: http://lkml.kernel.org/r/20190830035716.GA190684@LGEARND20B15 Signed-off-by: Austin Kim <austindh.kim@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Uladzislau Rezki (Sony) <urezki@gmail.com> Cc: Roman Gushchin <guro@fb.com> Cc: Roman Penyaev <rpenyaev@suse.de> Cc: Rick Edgecombe <rick.p.edgecombe@intel.com> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
688fcbfc06 |
mm/vmalloc: modify struct vmap_area to reduce its size
Objective --------- The current implementation of struct vmap_area wasted space. After applying this commit, sizeof(struct vmap_area) has been reduced from 11 words to 8 words. Description ----------- 1) Pack "subtree_max_size", "vm" and "purge_list". This is no problem because A) "subtree_max_size" is only used when vmap_area is in "free" tree B) "vm" is only used when vmap_area is in "busy" tree C) "purge_list" is only used when vmap_area is in vmap_purge_list 2) Eliminate "flags". ;Since only one flag VM_VM_AREA is being used, and the same thing can be done by judging whether "vm" is NULL, then the "flags" can be eliminated. Link: http://lkml.kernel.org/r/20190716152656.12255-3-lpf.vector@gmail.com Signed-off-by: Pengfei Li <lpf.vector@gmail.com> Suggested-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Cc: Hillf Danton <hdanton@sina.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com> Cc: Roman Gushchin <guro@fb.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
dd3b8353ba |
mm/vmalloc: do not keep unpurged areas in the busy tree
The busy tree can be quite big, even though the area is freed or unmapped it still stays there until "purge" logic removes it. 1) Optimize and reduce the size of "busy" tree by removing a node from it right away as soon as user triggers free paths. It is possible to do so, because the allocation is done using another augmented tree. The vmalloc test driver shows the difference, for example the "fix_size_alloc_test" is ~11% better comparing with default configuration: sudo ./test_vmalloc.sh performance <default> Summary: fix_size_alloc_test loops: 1000000 avg: 993985 usec Summary: full_fit_alloc_test loops: 1000000 avg: 973554 usec Summary: long_busy_list_alloc_test loops: 1000000 avg: 12617652 usec <default> <this patch> Summary: fix_size_alloc_test loops: 1000000 avg: 882263 usec Summary: full_fit_alloc_test loops: 1000000 avg: 973407 usec Summary: long_busy_list_alloc_test loops: 1000000 avg: 12593929 usec <this patch> 2) Since the busy tree now contains allocated areas only and does not interfere with lazily free nodes, introduce the new function show_purge_info() that dumps "unpurged" areas that is propagated through "/proc/vmallocinfo". 3) Eliminate VM_LAZY_FREE flag. Link: http://lkml.kernel.org/r/20190716152656.12255-2-lpf.vector@gmail.com Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Signed-off-by: Pengfei Li <lpf.vector@gmail.com> Cc: Roman Gushchin <guro@fb.com> Cc: Uladzislau Rezki <urezki@gmail.com> Cc: Hillf Danton <hdanton@sina.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
fe9041c245 |
vmalloc: lift the arm flag for coherent mappings to common code
The arm architecture had a VM_ARM_DMA_CONSISTENT flag to mark DMA coherent remapping for a while. Lift this flag to common code so that we can use it generically. We also check it in the only place VM_USERMAP is directly check so that we can entirely replace that flag as well (although I'm not even sure why we'd want to allow remapping DMA appings, but I'd rather not change behavior). Signed-off-by: Christoph Hellwig <hch@lst.de> |
||
|
|
5336e52c9e |
mm/vmalloc.c: fix percpu free VM area search criteria
Recent changes to the vmalloc code by commit |
||
|
|
3f8fd02b1b |
mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()
On x86-32 with PTI enabled, parts of the kernel page-tables are not shared
between processes. This can cause mappings in the vmalloc/ioremap area to
persist in some page-tables after the region is unmapped and released.
When the region is re-used the processes with the old mappings do not fault
in the new mappings but still access the old ones.
This causes undefined behavior, in reality often data corruption, kernel
oopses and panics and even spontaneous reboots.
Fix this problem by activly syncing unmaps in the vmalloc/ioremap area to
all page-tables in the system before the regions can be re-used.
References: https://bugzilla.suse.com/show_bug.cgi?id=1118689
Fixes:
|
||
|
|
97105f0ab7 |
mm: vmalloc: show number of vmalloc pages in /proc/meminfo
Vmalloc() is getting more and more used these days (kernel stacks, bpf and
percpu allocator are new top users), and the total % of memory consumed by
vmalloc() can be pretty significant and changes dynamically.
/proc/meminfo is the best place to display this information: its top goal
is to show top consumers of the memory.
Since the VmallocUsed field in /proc/meminfo is not in use for quite a
long time (it has been defined to 0 by
|
||
|
|
d9009d67f4 |
mm/vmalloc.c: spelling> s/informaion/information/
Link: http://lkml.kernel.org/r/20190607113509.15032-1-geert+renesas@glider.be Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Souptick Joarder <jrdr.linux@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|
|
460e42d19a |
mm/vmalloc.c: switch to WARN_ON() and move it under unlink_va()
Trigger a warning if an object that is about to be freed is detached. We used to have a BUG_ON(), but even though it is considered as faulty behaviour that is not a good reason to break a system. Link: http://lkml.kernel.org/r/20190606120411.8298-5-urezki@gmail.com Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Cc: Roman Gushchin <guro@fb.com> Cc: Hillf Danton <hdanton@sina.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |