mirror of
https://github.com/linux-apfs/linux-apfs.git
synced 2026-05-01 15:00:59 -07:00
2a24bb28a315ea2579fbf13a99a69a10cf4c085e
654 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
96db800f5d |
mm: rename alloc_pages_exact_node() to __alloc_pages_node()
alloc_pages_exact_node() was introduced in commit |
||
|
|
45eb00cd3a |
mm/slub: don't wait for high-order page allocation
Description is almost copied from commit
|
||
|
|
80da026a8e |
mm/slub: fix slab double-free in case of duplicate sysfs filename
sysfs_slab_add() shouldn't call kobject_put() on its error path: this drops the last reference to the kmem-cache kobject and frees it. The kmem cache is then freed a second time on the error path in kmem_cache_create().
For example, this happens when slub debug has been enabled at runtime and somebody creates a new kmem cache:
# echo 1 | tee /sys/kernel/slab/*/sanity_checks
# modprobe configfs
"configfs_dir_cache" cannot be merged because the existing slabs have debug enabled, and a new slab cannot be created because the unique name ":t-0000096" is already taken.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
|
|
588f8ba913 |
mm/slub: move slab initialization into irq enabled region
Initializing a new slab can introduce rather large latencies because most of the initialization always runs with interrupts disabled.
There is no point in doing so. The newly allocated slab is not visible yet, so there is no reason to protect it against concurrent alloc/free.
Move the expensive parts of the initialization into allocate_slab(), so for all allocations with GFP_WAIT set, interrupts are enabled.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
|
|
3eed034d04 |
slub: add support for kmem_cache_debug in bulk calls
Per request of Joonsoo Kim adding kmem debug support.
I've tested that when debugging is disabled, then there is almost no performance impact as this code basically gets removed by the compiler.
Need some guidance in enabling and testing this.
bulk- PREVIOUS - THIS-PATCH
1 - 43 cycles(tsc) 10.811 ns - 44 cycles(tsc) 11.236 ns improved -2.3%
2 - 27 cycles(tsc) 6.867 ns - 28 cycles(tsc) 7.019 ns improved -3.7%
3 - 21 cycles(tsc) 5.496 ns - 22 cycles(tsc) 5.526 ns improved -4.8%
4 - 24 cycles(tsc) 6.038 ns - 19 cycles(tsc) 4.786 ns improved 20.8%
8 - 17 cycles(tsc) 4.280 ns - 18 cycles(tsc) 4.572 ns improved -5.9%
16 - 17 cycles(tsc) 4.483 ns - 18 cycles(tsc) 4.658 ns improved -5.9%
30 - 18 cycles(tsc) 4.531 ns - 18 cycles(tsc) 4.568 ns improved 0.0%
32 - 58 cycles(tsc) 14.586 ns - 65 cycles(tsc) 16.454 ns improved -12.1%
34 - 53 cycles(tsc) 13.391 ns - 63 cycles(tsc) 15.932 ns improved -18.9%
48 - 65 cycles(tsc) 16.268 ns - 50 cycles(tsc) 12.506 ns improved 23.1%
64 - 53 cycles(tsc) 13.440 ns - 63 cycles(tsc) 15.929 ns improved -18.9%
128 - 79 cycles(tsc) 19.899 ns - 86 cycles(tsc) 21.583 ns improved -8.9%
158 - 90 cycles(tsc) 22.732 ns - 90 cycles(tsc) 22.552 ns improved 0.0%
250 - 95 cycles(tsc) 23.916 ns - 98 cycles(tsc) 24.589 ns improved -3.2%
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
|
|
fbd02630c6 |
slub: initial bulk free implementation
This implements SLUB specific kmem_cache_free_bulk(). The SLUB allocator now
has both bulk alloc and free implemented.
Choose to reenable local IRQs while calling slowpath __slab_free(). In
worst case, where all objects hit slowpath call, the performance should
still be faster than fallback function __kmem_cache_free_bulk(), because
local_irq_{disable+enable} is very fast (7-cycles), while the fallback
invokes this_cpu_cmpxchg() which is slightly slower (9-cycles).
Nitpicking, this should be faster for N>=4, due to the entry cost of
local_irq_{disable+enable}.
Do notice that the save+restore variant is very expensive; this is key to
why this optimization works.
CPU: i7-4790K CPU @ 4.00GHz
* local_irq_{disable,enable}: 7 cycles(tsc) - 1.821 ns
* local_irq_{save,restore} : 37 cycles(tsc) - 9.443 ns
Measurements on CPU i7-4790K @ 4.00GHz
Baseline normal fastpath (alloc+free cost): 43 cycles(tsc) 10.834 ns
Bulk- fallback - this-patch
1 - 58 cycles(tsc) 14.542 ns - 43 cycles(tsc) 10.811 ns improved 25.9%
2 - 50 cycles(tsc) 12.659 ns - 27 cycles(tsc) 6.867 ns improved 46.0%
3 - 48 cycles(tsc) 12.168 ns - 21 cycles(tsc) 5.496 ns improved 56.2%
4 - 47 cycles(tsc) 11.987 ns - 24 cycles(tsc) 6.038 ns improved 48.9%
8 - 46 cycles(tsc) 11.518 ns - 17 cycles(tsc) 4.280 ns improved 63.0%
16 - 45 cycles(tsc) 11.366 ns - 17 cycles(tsc) 4.483 ns improved 62.2%
30 - 45 cycles(tsc) 11.433 ns - 18 cycles(tsc) 4.531 ns improved 60.0%
32 - 75 cycles(tsc) 18.983 ns - 58 cycles(tsc) 14.586 ns improved 22.7%
34 - 71 cycles(tsc) 17.940 ns - 53 cycles(tsc) 13.391 ns improved 25.4%
48 - 80 cycles(tsc) 20.077 ns - 65 cycles(tsc) 16.268 ns improved 18.8%
64 - 71 cycles(tsc) 17.799 ns - 53 cycles(tsc) 13.440 ns improved 25.4%
128 - 91 cycles(tsc) 22.980 ns - 79 cycles(tsc) 19.899 ns improved 13.2%
158 - 100 cycles(tsc) 25.241 ns - 90 cycles(tsc) 22.732 ns improved 10.0%
250 - 102 cycles(tsc) 25.583 ns - 95 cycles(tsc) 23.916 ns improved 6.9%
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
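The "improved" column in the bulk free measurements above appears to be the relative reduction in cycles compared with the fallback (for example 58 to 43 cycles gives about 25.9%). The following is a minimal sketch of that arithmetic; the function name `improvement_pct` is made up for illustration and is not taken from the kernel or from the benchmark module.

```c
#include <assert.h>

/*
 * Relative improvement in percent: how much cheaper the patched path is
 * than the fallback path. A negative result means the patched path is
 * slower. The name is hypothetical; the formula is inferred from the
 * numbers in the commit message, not quoted from it.
 */
double improvement_pct(unsigned fallback_cycles, unsigned patched_cycles)
{
    assert(fallback_cycles > 0);
    return 100.0 * ((double)fallback_cycles - (double)patched_cycles)
                 / (double)fallback_cycles;
}
```

Calling `improvement_pct(50, 27)` reproduces the 46.0% reported for a bulk size of 2.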
|
|
ebe909e0fd |
slub: improve bulk alloc strategy
Call slowpath __slab_alloc() from within the bulk loop, as the side-effect
of this call likely repopulates c->freelist.
Choose to reenable local IRQs while calling slowpath.
Saving some optimizations for later. E.g. it is possible to extract
parts of __slab_alloc() and avoid the unnecessary and expensive (37
cycles) local_irq_{save,restore}. For now, be happy calling
__slab_alloc(): this lowers the icache impact of this function, and I
don't have to worry about correctness.
Measurements on CPU i7-4790K @ 4.00GHz
Baseline normal fastpath (alloc+free cost): 42 cycles(tsc) 10.601 ns
Bulk- fallback - this-patch
1 - 58 cycles(tsc) 14.516 ns - 49 cycles(tsc) 12.459 ns improved 15.5%
2 - 51 cycles(tsc) 12.930 ns - 38 cycles(tsc) 9.605 ns improved 25.5%
3 - 49 cycles(tsc) 12.274 ns - 34 cycles(tsc) 8.525 ns improved 30.6%
4 - 48 cycles(tsc) 12.058 ns - 32 cycles(tsc) 8.036 ns improved 33.3%
8 - 46 cycles(tsc) 11.609 ns - 31 cycles(tsc) 7.756 ns improved 32.6%
16 - 45 cycles(tsc) 11.451 ns - 32 cycles(tsc) 8.148 ns improved 28.9%
30 - 79 cycles(tsc) 19.865 ns - 68 cycles(tsc) 17.164 ns improved 13.9%
32 - 76 cycles(tsc) 19.212 ns - 66 cycles(tsc) 16.584 ns improved 13.2%
34 - 74 cycles(tsc) 18.600 ns - 63 cycles(tsc) 15.954 ns improved 14.9%
48 - 88 cycles(tsc) 22.092 ns - 77 cycles(tsc) 19.373 ns improved 12.5%
64 - 80 cycles(tsc) 20.043 ns - 68 cycles(tsc) 17.188 ns improved 15.0%
128 - 99 cycles(tsc) 24.818 ns - 89 cycles(tsc) 22.404 ns improved 10.1%
158 - 99 cycles(tsc) 24.977 ns - 92 cycles(tsc) 23.089 ns improved 7.1%
250 - 106 cycles(tsc) 26.552 ns - 99 cycles(tsc) 24.785 ns improved 6.6%
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
|
|
994eb764ec |
slub bulk alloc: extract objects from the per cpu slab
First piece: acceleration of retrieval of per cpu objects
If we are allocating lots of objects then it is advantageous to disable interrupts and avoid the this_cpu_cmpxchg() operation to get these objects faster.
Note that we cannot do the fast operation if debugging is enabled, because we would have to add extra code to do all the debugging checks. And it would not be fast anyway.
Note also that the requirement of having interrupts disabled avoids having to do processor flag operations.
Allocate as many objects as possible in the fast way and then fall back to the generic implementation for the rest of the objects.
Measurements on CPU i7-4790K @ 4.00GHz
Baseline normal fastpath (alloc+free cost): 42 cycles(tsc) 10.554 ns
Bulk- fallback - this-patch
1 - 57 cycles(tsc) 14.432 ns - 48 cycles(tsc) 12.155 ns improved 15.8%
2 - 50 cycles(tsc) 12.746 ns - 37 cycles(tsc) 9.390 ns improved 26.0%
3 - 48 cycles(tsc) 12.180 ns - 33 cycles(tsc) 8.417 ns improved 31.2%
4 - 48 cycles(tsc) 12.015 ns - 32 cycles(tsc) 8.045 ns improved 33.3%
8 - 46 cycles(tsc) 11.526 ns - 30 cycles(tsc) 7.699 ns improved 34.8%
16 - 45 cycles(tsc) 11.418 ns - 32 cycles(tsc) 8.205 ns improved 28.9%
30 - 80 cycles(tsc) 20.246 ns - 73 cycles(tsc) 18.328 ns improved 8.8%
32 - 79 cycles(tsc) 19.946 ns - 72 cycles(tsc) 18.208 ns improved 8.9%
34 - 78 cycles(tsc) 19.659 ns - 71 cycles(tsc) 17.987 ns improved 9.0%
48 - 86 cycles(tsc) 21.516 ns - 82 cycles(tsc) 20.566 ns improved 4.7%
64 - 93 cycles(tsc) 23.423 ns - 89 cycles(tsc) 22.480 ns improved 4.3%
128 - 100 cycles(tsc) 25.170 ns - 99 cycles(tsc) 24.871 ns improved 1.0%
158 - 102 cycles(tsc) 25.549 ns - 101 cycles(tsc) 25.375 ns improved 1.0%
250 - 101 cycles(tsc) 25.344 ns - 100 cycles(tsc) 25.182 ns improved 1.0%
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
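The idea in the commit message above (take as many objects as possible from the per cpu freelist in one pass, then leave the rest to the generic path) can be modelled in user space roughly as follows. This is only a sketch under stated assumptions: `model_object`, `model_cpu_slab` and `model_fast_alloc_bulk` are hypothetical names, and the real code runs with interrupts disabled, which plain C cannot express here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: free objects are chained through their first word. */
struct model_object {
    struct model_object *next;
};

/* Hypothetical stand-in for the per cpu slab state. */
struct model_cpu_slab {
    struct model_object *freelist;
};

/*
 * Pop up to nr objects from the freelist into p[] and return how many
 * were taken. The caller is expected to obtain the remaining objects
 * through a slower, generic path. In the kernel this loop would run
 * with local interrupts disabled instead of using this_cpu_cmpxchg().
 */
size_t model_fast_alloc_bulk(struct model_cpu_slab *c, size_t nr, void **p)
{
    size_t taken = 0;

    assert(c != NULL && p != NULL);
    while (taken < nr && c->freelist != NULL) {
        struct model_object *obj = c->freelist;

        c->freelist = obj->next;   /* unlink the head object */
        p[taken++] = obj;
    }
    return taken;
}
```

A return value smaller than `nr` signals that the freelist ran dry and the fallback has to supply the remainder.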
|
|
484748f0b6 |
slab: infrastructure for bulk object allocation and freeing
Add the basic infrastructure for alloc/free operations on pointer arrays. It includes a generic function in the common slab code that is used in this infrastructure patch to create the unoptimized functionality for slab bulk operations.
Allocators can then provide optimized allocation functions for situations in which large numbers of objects are needed. These optimizations may avoid taking locks repeatedly and bypass metadata creation if all objects in slab pages can be used to provide the objects required.
Allocators can extend the skeletons provided and add their own code to the bulk alloc and free functions. They can keep the generic allocation and freeing and just fall back to those if optimizations would not work (like for example when debugging is on).
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
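An unoptimized bulk operation on a pointer array, as described in the commit message above, is essentially a loop over the single-object calls with cleanup on failure. The sketch below illustrates that shape in user space with `malloc()`/`free()`; `model_cache`, `model_alloc_bulk` and `model_free_bulk` are hypothetical names, not the kernel's functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a slab cache: only the object size matters. */
struct model_cache {
    size_t object_size;
};

/* Free nr objects from p[], one call per object, and clear the slots. */
void model_free_bulk(struct model_cache *c, size_t nr, void **p)
{
    (void)c;
    for (size_t i = 0; i < nr; i++) {
        free(p[i]);
        p[i] = NULL;
    }
}

/*
 * Fill p[0..nr-1] one object at a time. If any allocation fails, free
 * what was already allocated so the caller never sees a partial array.
 */
bool model_alloc_bulk(struct model_cache *c, size_t nr, void **p)
{
    assert(c != NULL && p != NULL);
    for (size_t i = 0; i < nr; i++) {
        p[i] = malloc(c->object_size);
        if (p[i] == NULL) {
            model_free_bulk(c, i, p);
            return false;
        }
    }
    return true;
}
```

An allocator-specific version would keep this pair as the fallback and add a faster path in front of it.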
|
|
2ae44005b6 |
slub: fix spelling succedd to succeed
With this patchset the SLUB allocator now has both bulk alloc and free
implemented.
This patchset mostly optimizes the "fastpath" where objects are available
on the per CPU fastpath page. This mostly amortizes the less-heavy
non-locked cmpxchg_double used on the fastpath.
The "fallback" bulking (e.g. __kmem_cache_free_bulk) provides a good basis
for comparison. For the measurements[1], the fallback functions
__kmem_cache_{free,alloc}_bulk have been copied from slab_common.c and
forced "noinline" to force a function call like slab_common.c.
Measurements on CPU i7-4790K @ 4.00GHz
Baseline normal fastpath (alloc+free cost): 42 cycles(tsc) 10.601 ns
Measurements last-patch with disabled debugging:
Bulk- fallback - this-patch
1 - 57 cycles(tsc) 14.448 ns - 44 cycles(tsc) 11.236 ns improved 22.8%
2 - 51 cycles(tsc) 12.768 ns - 28 cycles(tsc) 7.019 ns improved 45.1%
3 - 48 cycles(tsc) 12.232 ns - 22 cycles(tsc) 5.526 ns improved 54.2%
4 - 48 cycles(tsc) 12.025 ns - 19 cycles(tsc) 4.786 ns improved 60.4%
8 - 46 cycles(tsc) 11.558 ns - 18 cycles(tsc) 4.572 ns improved 60.9%
16 - 45 cycles(tsc) 11.458 ns - 18 cycles(tsc) 4.658 ns improved 60.0%
30 - 45 cycles(tsc) 11.499 ns - 18 cycles(tsc) 4.568 ns improved 60.0%
32 - 79 cycles(tsc) 19.917 ns - 65 cycles(tsc) 16.454 ns improved 17.7%
34 - 78 cycles(tsc) 19.655 ns - 63 cycles(tsc) 15.932 ns improved 19.2%
48 - 68 cycles(tsc) 17.049 ns - 50 cycles(tsc) 12.506 ns improved 26.5%
64 - 80 cycles(tsc) 20.009 ns - 63 cycles(tsc) 15.929 ns improved 21.3%
128 - 94 cycles(tsc) 23.749 ns - 86 cycles(tsc) 21.583 ns improved 8.5%
158 - 97 cycles(tsc) 24.299 ns - 90 cycles(tsc) 22.552 ns improved 7.2%
250 - 102 cycles(tsc) 25.681 ns - 98 cycles(tsc) 24.589 ns improved 3.9%
Benchmarking shows impressive improvements in the "fastpath" with a small
number of objects in the working set. Once the working set increases,
resulting in activating the "slowpath" (that contains the heavier locked
cmpxchg_double) the improvement decreases.
I'm currently working on also optimizing the "slowpath" (as network stack
use-case hits this), but this patchset should provide a good foundation
for further improvements. Rest of my patch queue in this area needs some
more work, but preliminary results are good. I'm attending Netfilter
Workshop[2] next week, and I'll hopefully return to working on further
improvements in this area.
This patch (of 6):
s/succedd/succeed/
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
|
|
2f064f3485 |
mm: make page pfmemalloc check more robust
Commit |
||
|
|
34cc6990d4 |
slab: correct size_index table before replacing the bootstrap kmem_cache_node
This patch moves the initialization of the size_index table slightly earlier so that the first few kmem_cache_node's can be safely allocated when KMALLOC_MIN_SIZE is large.
There are currently two ways to generate indices into kmalloc_caches (via kmalloc_index() and via the size_index table in slab_common.c) and on some arches (possibly only MIPS) they potentially disagree with each other until create_kmalloc_caches() has been called. It seems that the intention is that the size_index table is a fast equivalent to kmalloc_index() and that create_kmalloc_caches() patches the table to return the correct value for the cases where kmalloc_index()'s if-statements apply.
The failing sequence was:
* kmalloc_caches contains NULL elements
* kmem_cache_init initialises the element that 'struct kmem_cache_node' will be allocated to. For 32-bit MIPS, this is a 56-byte struct and kmalloc_index returns KMALLOC_SHIFT_LOW (7).
* init_list is called which calls kmalloc_node to allocate a 'struct kmem_cache_node'.
* kmalloc_slab selects the kmem_caches element using size_index[size_index_elem(size)]. For MIPS, size is 56, and the expression returns 6.
* This element of kmalloc_caches is NULL and allocation fails.
* If it had not already failed, it would have called create_kmalloc_caches() at this point which would have changed size_index[size_index_elem(size)] to 7.
I don't believe the bug to be LLVM specific but GCC doesn't normally encounter the problem. I haven't been able to identify exactly what GCC is doing better (probably inlining) but it seems that GCC is managing to optimize to the point that it eliminates the problematic allocations. This theory is supported by the fact that GCC can be made to fail in the same way by changing inline, __inline, __inline__, and __always_inline in include/linux/compiler-gcc.h such that they don't actually inline things.
Signed-off-by: Daniel Sanders <daniel.sanders@imgtec.com>
Acked-by: Pekka Enberg <penberg@kernel.org>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
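The disagreement in the failing sequence above can be illustrated with a small user-space model: before the table is patched, a 56-byte request looks up index 6, while the configuration's minimum cache corresponds to index 7. This is a simplified sketch, not the kernel code. The table contents, the `MODEL_*` constants and all function names are assumptions chosen to match the numbers quoted in the commit message (size 56, result 6, corrected to 7).

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MIN_SIZE  128  /* assumed minimum kmalloc size in the failing setup */
#define MODEL_SHIFT_LOW 7    /* log2(MODEL_MIN_SIZE) */

/*
 * Model of a size_index-style table: entry (size - 1) / 8 holds the
 * cache index for requests of that size. Values are illustrative.
 */
static unsigned char model_size_index[16] = {
    3, 4, 5, 5, 6, 6, 6, 6,   /* sizes   8 ..  64 */
    1, 1, 1, 1, 7, 7, 7, 7    /* sizes  72 .. 128 */
};

/* Map a request size in bytes to a table slot. */
size_t model_size_index_elem(size_t bytes)
{
    assert(bytes >= 1);
    return (bytes - 1) / 8;
}

/* Table lookup as the fast path would do it. */
unsigned model_lookup(size_t bytes)
{
    size_t elem = model_size_index_elem(bytes);

    assert(elem < sizeof(model_size_index));
    return model_size_index[elem];
}

/* Redirect every size below the minimum to the smallest existing cache. */
void model_patch_size_index(void)
{
    for (size_t bytes = 8; bytes < MODEL_MIN_SIZE; bytes += 8) {
        size_t elem = model_size_index_elem(bytes);

        if (elem >= sizeof(model_size_index))
            break;
        model_size_index[elem] = MODEL_SHIFT_LOW;
    }
}
```

In this model, `model_lookup(56)` yields 6 before `model_patch_size_index()` runs and 7 afterwards, which is why the ordering of the patch step relative to the first allocations matters.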
|
|
4db0c3c298 |
mm: remove rest of ACCESS_ONCE() usages
We converted some of the usages of ACCESS_ONCE to READ_ONCE in the mm/ tree since it doesn't work reliably on non-scalar types.
This patch removes the rest of the usages of ACCESS_ONCE, and uses the new READ_ONCE API for the read accesses. This makes things cleaner, instead of using separate/multiple sets of APIs.
Signed-off-by: Jason Low <jason.low2@hp.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Acked-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
|
|
6f6528a163 |
slub: use bool function return values of true/false not 1/0
Use the normal return values for bool functions
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
|
|
08303a73c6 |
mm/slub.c: parse slub_debug O option in switch statement
By moving the O option detection into the switch statement, we allow this parameter to be combined with other options correctly. Previously options like slub_debug=OFZ would only detect the 'o' and use DEBUG_DEFAULT_FLAGS to fill in the rest of the flags.
Signed-off-by: Chris J Arges <chris.j.arges@canonical.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
|
|
859b7a0e89 |
mm/slub: fix lockups on PREEMPT && !SMP kernels
Commit
|
||
|
|
0316bec22e |
mm: slub: add kernel address sanitizer support for slub allocator
With this patch kasan will be able to catch bugs in memory allocated by slub. Initially, all objects in a newly allocated slab page are marked as redzone. Later, when a slub object is allocated, the number of bytes requested by the caller is marked as accessible, and the rest of the object (including slub's metadata) is marked as redzone (inaccessible).
We also mark an object as accessible if ksize was called for it. There are some places in the kernel where the ksize function is called to inquire the size of the really allocated area. Such callers can validly access the whole allocated memory, so it should be marked as accessible.
Code in the slub.c and slab_common.c files can validly access an object's metadata, so instrumentation for these files is disabled.
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Signed-off-by: Dmitry Chernenkov <dmitryc@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Serebryany <kcc@google.com>
Signed-off-by: Andrey Konovalov <adech.fo@gmail.com>
Cc: Yuri Gribov <tetra2005@gmail.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
|
|
a79316c617 |
mm: slub: introduce metadata_access_enable()/metadata_access_disable()
It's ok for slub to access memory that is marked by kasan as inaccessible (object's metadata). Kasan shouldn't print a report in that case because these accesses are valid. Disabling instrumentation of slub.c code is not enough to achieve this because slub passes a pointer to the object's metadata into external functions like memchr_inv().
We don't want to disable instrumentation for memchr_inv() because this is quite a generic function, and we don't want to miss bugs.
metadata_access_enable/metadata_access_disable are used to tell KASan where accesses to metadata start and end, so we can temporarily disable KASan reports.
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrey Konovalov <adech.fo@gmail.com>
Cc: Yuri Gribov <tetra2005@gmail.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
|
|
75c66def8d |
mm: slub: share object_err function
Remove static and add function declarations to linux/slub_def.h so it could be used by kernel address sanitizer.
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrey Konovalov <adech.fo@gmail.com>
Cc: Yuri Gribov <tetra2005@gmail.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
|
|
5024c1d71b |
slub: use %*pb[l] to print bitmaps including cpumasks and nodemasks
printk and friends can now format bitmaps using '%*pb[l]'. cpumask and nodemask also provide cpumask_pr_args() and nodemask_pr_args() respectively which can be used to generate the two printf arguments necessary to format the specified cpu/nodemask.
* This is an equivalent conversion but the whole function should be converted to use the scnprintf family of functions rather than performing custom output length predictions in multiple places.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
|
|
d6e0b7fa11 |
slub: make dead caches discard free slabs immediately
To speed up further allocations SLUB may store empty slabs in per cpu/node partial lists instead of freeing them immediately. This prevents per memcg caches destruction, because kmem caches created for a memory cgroup are only destroyed after the last page charged to the cgroup is freed.
To fix this issue, this patch resurrects the approach first proposed in [1]. It forbids SLUB to cache empty slabs after the memory cgroup that the cache belongs to was destroyed. It is achieved by setting kmem_cache's cpu_partial and min_partial constants to 0 and tuning put_cpu_partial() so that it would drop frozen empty slabs immediately if cpu_partial = 0.
The runtime overhead is minimal. From all the hot functions, we only touch relatively cold put_cpu_partial(): we make it call unfreeze_partials() after freezing a slab that belongs to an offline memory cgroup. Since slab freezing exists to avoid moving slabs from/to a partial list on free/alloc, and there can't be allocations from dead caches, it shouldn't cause any overhead. We do have to disable preemption for put_cpu_partial() to achieve that though.
The original patch was accepted well and even merged to the mm tree. However, I decided to withdraw it due to changes happening to the memcg core at that time. I had an idea of introducing per-memcg shrinkers for kmem caches, but now, as memcg has finally settled down, I do not see it as an option, because SLUB shrinker would be too costly to call since SLUB does not keep free slabs on a separate list. Besides, we currently do not even call per-memcg shrinkers for offline memcgs. Overall, it would introduce much more complexity to both SLUB and memcg than this small patch.
Regarding SLAB, there's no problem with it, because it shrinks per-cpu/node caches periodically. Thanks to list_lru reparenting, we no longer keep entries for offline cgroups in per-memcg arrays (such as memcg_cache_params->memcg_caches), so we do not have to bother if a per-memcg cache will be shrunk a bit later than it could be.
[1] http://thread.gmane.org/gmane.linux.kernel.mm/118649/focus=118650
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
|
|
ce3712d74d |
slub: fix kmem_cache_shrink return value
It is supposed to return 0 if the cache has no remaining objects and 1 otherwise, while currently it always returns 0. Fix it.
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
|
|
832f37f5d5 |
slub: never fail to shrink cache
SLUB's version of __kmem_cache_shrink() not only removes empty slabs, but also tries to rearrange the partial lists to place slabs filled up most to the head to cope with fragmentation. To achieve that, it allocates a temporary array of lists used to sort slabs by the number of objects in use. If the allocation fails, the whole procedure is aborted.
This is unacceptable for the kernel memory accounting extension of the memory cgroup, where we want to make sure that kmem_cache_shrink() successfully discarded empty slabs. Although the allocation failure is utterly unlikely with the current page allocator implementation, which retries GFP_KERNEL allocations of order <= 2 infinitely, it is better not to rely on that.
This patch therefore makes __kmem_cache_shrink() allocate the array on stack instead of calling kmalloc, which may fail. The array size is chosen to be equal to 32, because most SLUB caches store not more than 32 objects per slab page. Slab pages with <= 32 free objects are sorted using the array by the number of objects in use and promoted to the head of the partial list, while slab pages with > 32 free objects are left in the end of the list without any ordering imposed on them.
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Huang Ying <ying.huang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
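The fixed-size, on-stack sorting described in the commit message above can be sketched as a bucket pass over the partial slabs: empty slabs are dropped, slabs with at most 32 free objects are ordered fullest first, and the rest go to the tail unsorted. The code below is a user-space model under stated assumptions. `model_slab`, `model_bucket_of` and `model_shrink_order` are hypothetical names, and arrays of indices stand in for the kernel's linked lists.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PROMOTE_MAX 32  /* bucket count, matching the array size of 32 */

/* Hypothetical model of a partial slab: total objects and objects in use. */
struct model_slab {
    int objects;
    int inuse;
};

/*
 * Bucket for one slab: -1 means empty (discard), 0..31 means 1..32 free
 * objects, and MODEL_PROMOTE_MAX is the unsorted tail bucket.
 */
int model_bucket_of(const struct model_slab *s)
{
    int free_objs = s->objects - s->inuse;

    /* A slab on the partial list has at least one free object. */
    assert(free_objs >= 1 && free_objs <= s->objects);
    if (free_objs == s->objects)
        return -1;
    if (free_objs <= MODEL_PROMOTE_MAX)
        return free_objs - 1;
    return MODEL_PROMOTE_MAX;
}

/*
 * Write the new order of slab indices to out[] and return how many
 * slabs were kept. Only a fixed-size array on the stack is used, so
 * the pass cannot fail for lack of memory.
 */
size_t model_shrink_order(const struct model_slab *slabs, size_t n, size_t *out)
{
    size_t next[MODEL_PROMOTE_MAX + 1] = { 0 };
    size_t kept = 0;

    /* Pass 1: count slabs per bucket. */
    for (size_t i = 0; i < n; i++) {
        int b = model_bucket_of(&slabs[i]);

        if (b >= 0)
            next[b]++;
    }
    /* Pass 2: turn counts into start offsets (fullest slabs first). */
    for (int b = 0; b <= MODEL_PROMOTE_MAX; b++) {
        size_t count = next[b];

        next[b] = kept;
        kept += count;
    }
    /* Pass 3: place each kept slab into its bucket's range. */
    for (size_t i = 0; i < n; i++) {
        int b = model_bucket_of(&slabs[i]);

        if (b >= 0)
            out[next[b]++] = i;
    }
    return kept;
}
```

Slabs in the tail bucket keep their relative input order, which mirrors the "without any ordering imposed" wording in the message.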
|
|
426589f571 |
slab: link memcg caches of the same kind into a list
Sometimes, we need to iterate over all memcg copies of a particular root kmem cache. Currently, we use memcg_cache_params->memcg_caches array for that, because it contains all existing memcg caches.
However, it's a bad practice to keep all caches, including those that belong to offline cgroups, in this array, because it will be growing beyond any bounds then. I'm going to wipe away dead caches from it to save space. To still be able to perform iterations over all memcg caches of the same kind, let us link them into a list.
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
||
|
|
f7ce3190c4 |
slab: embed memcg_cache_params to kmem_cache
Currently, kmem_cache stores a pointer to struct memcg_cache_params
instead of embedding it. The rationale is to save memory when kmem
accounting is disabled. However, the memcg_cache_params has shrivelled
drastically since it was first introduced:
* Initially:
struct memcg_cache_params {
    bool is_root_cache;
    union {
        struct kmem_cache *memcg_caches[0];
        struct {
            struct mem_cgroup *memcg;
            struct list_head list;
            struct kmem_cache *root_cache;
            bool dead;
            atomic_t nr_pages;
            struct work_struct destroy;
        };
    };
};
* Now:
struct memcg_cache_params {
    bool is_root_cache;
    union {
        struct {
            struct rcu_head rcu_head;
            struct kmem_cache *memcg_caches[0];
        };
        struct {
            struct mem_cgroup *memcg;
            struct kmem_cache *root_cache;
        };
    };
};
So the memory saving does not seem to be a clear win anymore.
OTOH, keeping a pointer to memcg_cache_params struct instead of embedding
it results in touching one more cache line on kmem alloc/free hot paths.
Besides, it makes linking kmem caches in a list chained by a field of
struct memcg_cache_params really painful due to a level of indirection,
while I want to make them linked in the following patch. That said, let
us embed it.
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|