Commit Graph

516 Commits

Author SHA1 Message Date
Pekka Enberg f4178cdddd Merge branch 'slab/common-for-cgroups' into slab/for-linus
Fix up a trivial conflict with NUMA_NO_NODE cleanups.

Conflicts:
	mm/slob.c

Signed-off-by: Pekka Enberg <penberg@kernel.org>
2012-10-03 09:56:37 +03:00
Pekka Enberg 023dc70470 Merge branch 'slab/next' into slab/for-linus 2012-10-03 09:56:12 +03:00
Fengguang Wu 788e1aadad slub: init_kmem_cache_cpus() and put_cpu_partial() can be static
Acked-by: Glauber Costa <glommer@parallels.com>
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2012-10-03 09:48:41 +03:00
Ezequiel Garcia 2b847c3cb4 mm, slub: Rename slab_alloc() -> slab_alloc_node() to match SLAB
This patch does not fix anything, and its only goal is to enable us
to obtain some common code between SLAB and SLUB.
Neither behavior nor produced code is affected.

Cc: Christoph Lameter <cl@linux.com>
Signed-off-by: Ezequiel Garcia <elezegarcia@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2012-09-25 10:18:37 +03:00
Dave Jones 645df230ca mm, sl[au]b: Taint kernel when we detect a corrupted slab
It doesn't seem worth adding a new taint flag for this, so just re-use
the one from 'bad page'

Acked-by: Christoph Lameter <cl@linux.com> # SLUB
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2012-09-19 10:08:01 +03:00
Joonsoo Kim 8ba00bb68a slub: consider pfmemalloc_match() in get_partial_node()
get_partial() is currently not checking pfmemalloc_match() meaning that
it is possible for pfmemalloc pages to leak to non-pfmemalloc users.
This is a problem in the following situation.  Assume that there is a
request from normal allocation and there are no objects in the per-cpu
cache and no node-partial slab.

In this case, slab_alloc enters the slow path and new_slab_objects() is
called which may return a PFMEMALLOC page.  As the current user is not
allowed to access PFMEMALLOC page, deactivate_slab() is called
([5091b74a: mm: slub: optimise the SLUB fast path to avoid pfmemalloc
checks]) and returns an object from PFMEMALLOC page.

Next time, when we get another request from normal allocation,
slab_alloc() enters the slow-path and calls new_slab_objects().  In
new_slab_objects(), we call get_partial() and get a partial slab which
was just deactivated but is a pfmemalloc page.  We extract one object
from it and re-deactivate.

  "deactivate -> re-get in get_partial -> re-deactivate" occures repeatedly.

As a result, access to PFMEMALLOC page is not properly restricted and it
can cause a performance degradation due to frequent deactivation.
deactivation frequently.

This patch changes get_partial_node() to take pfmemalloc_match() into
account and prevents the "deactivate -> re-get in get_partial()
scenario.  Instead, new_slab() is called.

Signed-off-by: Joonsoo Kim <js1304@gmail.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: David Miller <davem@davemloft.net>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-09-17 15:00:38 -07:00
Christoph Lameter 9df53b154a slub: Zero initial memory segment for kmem_cache and kmem_cache_node
Tony Luck reported the following problem on IA-64:

  Worked fine yesterday on next-20120905, crashes today. First sign of
  trouble was an unaligned access, then a NULL dereference. SL*B related
  bits of my config:

  CONFIG_SLUB_DEBUG=y
  # CONFIG_SLAB is not set
  CONFIG_SLUB=y
  CONFIG_SLABINFO=y
  # CONFIG_SLUB_DEBUG_ON is not set
  # CONFIG_SLUB_STATS is not set

  And he console log.

  PID hash table entries: 4096 (order: 1, 32768 bytes)
  Dentry cache hash table entries: 262144 (order: 7, 2097152 bytes)
  Inode-cache hash table entries: 131072 (order: 6, 1048576 bytes)
  Memory: 2047920k/2086064k available (13992k code, 38144k reserved,
  6012k data, 880k init)
  kernel unaligned access to 0xca2ffc55fb373e95, ip=0xa0000001001be550
  swapper[0]: error during unaligned kernel access
   -1 [1]
  Modules linked in:

  Pid: 0, CPU 0, comm:              swapper
  psr : 00001010084a2018 ifs : 800000000000060f ip  :
  [<a0000001001be550>]    Not tainted (3.6.0-rc4-zx1-smp-next-20120906)
  ip is at new_slab+0x90/0x680
  unat: 0000000000000000 pfs : 000000000000060f rsc : 0000000000000003
  rnat: 9666960159966a59 bsps: a0000001001441c0 pr  : 9666960159965a59
  ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c8a70433f
  csd : 0000000000000000 ssd : 0000000000000000
  b0  : a0000001001be500 b6  : a00000010112cb20 b7  : a0000001011660a0
  f6  : 0fff7f0f0f0f0e54f0000 f7  : 0ffe8c5c1000000000000
  f8  : 1000d8000000000000000 f9  : 100068800000000000000
  f10 : 10005f0f0f0f0e54f0000 f11 : 1003e0000000000000078
  r1  : a00000010155eef0 r2  : 0000000000000000 r3  : fffffffffffc1638
  r8  : e0000040600081b8 r9  : ca2ffc55fb373e95 r10 : 0000000000000000
  r11 : e000004040001646 r12 : a000000101287e20 r13 : a000000101280000
  r14 : 0000000000004000 r15 : 0000000000000078 r16 : ca2ffc55fb373e75
  r17 : e000004040040000 r18 : fffffffffffc1646 r19 : e000004040001646
  r20 : fffffffffffc15f8 r21 : 000000000000004d r22 : a00000010132fa68
  r23 : 00000000000000ed r24 : 0000000000000000 r25 : 0000000000000000
  r26 : 0000000000000001 r27 : a0000001012b8500 r28 : a00000010135f4a0
  r29 : 0000000000000000 r30 : 0000000000000000 r31 : 0000000000000001
  Unable to handle kernel NULL pointer dereference (address
  0000000000000018)
  swapper[0]: Oops 11003706212352 [2]
  Modules linked in:

  Pid: 0, CPU 0, comm:              swapper
  psr : 0000121008022018 ifs : 800000000000cc18 ip  :
  [<a0000001004dc8f1>]    Not tainted (3.6.0-rc4-zx1-smp-next-20120906)
  ip is at __copy_user+0x891/0x960
  unat: 0000000000000000 pfs : 0000000000000813 rsc : 0000000000000003
  rnat: 0000000000000000 bsps: 0000000000000000 pr  : 9666960159961765
  ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c0270033f
  csd : 0000000000000000 ssd : 0000000000000000
  b0  : a00000010004b550 b6  : a00000010004b740 b7  : a00000010000c750
  f6  : 000000000000000000000 f7  : 1003e9e3779b97f4a7c16
  f8  : 1003e0a00000010001550 f9  : 100068800000000000000
  f10 : 10005f0f0f0f0e54f0000 f11 : 1003e0000000000000078
  r1  : a00000010155eef0 r2  : a0000001012870b0 r3  : a0000001012870b8
  r8  : 0000000000000298 r9  : 0000000000000013 r10 : 0000000000000000
  r11 : 9666960159961a65 r12 : a000000101287010 r13 : a000000101280000
  r14 : a000000101287068 r15 : a000000101287080 r16 : 0000000000000298
  r17 : 0000000000000010 r18 : 0000000000000018 r19 : a000000101287310
  r20 : 0000000000000290 r21 : 0000000000000000 r22 : 0000000000000000
  r23 : a000000101386f58 r24 : 0000000000000000 r25 : 000000007fffffff
  r26 : a000000101287078 r27 : a0000001013c69b0 r28 : 0000000000000000
  r29 : 0000000000000014 r30 : 0000000000000000 r31 : 0000000000000813

Sedat Dilek and Hugh Dickins reported similar problems as well.

Earlier patches in the common set moved the zeroing of the kmem_cache
structure into common code. See "Move allocation of kmem_cache into
common code".

The allocation for the two special structures is still done from SLUB
specific code but no zeroing is done since the cache creation functions
used to zero. This now needs to be updated so that the structures are
zeroed during allocation in kmem_cache_init().  Otherwise random pointer
values may be followed.

Reported-by: Tony Luck <tony.luck@intel.com>
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Reported-by: Hugh Dickins <hughd@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2012-09-10 09:57:19 +03:00
Pekka Enberg aac3a1664a Revert "mm/sl[aou]b: Move sysfs_slab_add to common"
This reverts commit 96d17b7be0 which
caused the following errors at boot:

  [    1.114885] kobject (ffff88001a802578): tried to init an initialized object, something is seriously wrong.
  [    1.114885] Pid: 1, comm: swapper/0 Tainted: G        W    3.6.0-rc1+ #6
  [    1.114885] Call Trace:
  [    1.114885]  [<ffffffff81273f37>] kobject_init+0x87/0xa0
  [    1.115555]  [<ffffffff8127426a>] kobject_init_and_add+0x2a/0x90
  [    1.115555]  [<ffffffff8127c870>] ? sprintf+0x40/0x50
  [    1.115555]  [<ffffffff81124c60>] sysfs_slab_add+0x80/0x210
  [    1.115555]  [<ffffffff81100175>] kmem_cache_create+0xa5/0x250
  [    1.115555]  [<ffffffff81cf24cd>] ? md_init+0x144/0x144
  [    1.115555]  [<ffffffff81cf25b6>] local_init+0xa4/0x11b
  [    1.115555]  [<ffffffff81cf24e1>] dm_init+0x14/0x45
  [    1.115836]  [<ffffffff810001ba>] do_one_initcall+0x3a/0x160
  [    1.116834]  [<ffffffff81cc2c90>] kernel_init+0x133/0x1b7
  [    1.117835]  [<ffffffff81cc25c4>] ? do_early_param+0x86/0x86
  [    1.117835]  [<ffffffff8171aff4>] kernel_thread_helper+0x4/0x10
  [    1.118401]  [<ffffffff81cc2b5d>] ? start_kernel+0x33f/0x33f
  [    1.119832]  [<ffffffff8171aff0>] ? gs_change+0xb/0xb
  [    1.120325] ------------[ cut here ]------------
  [    1.120835] WARNING: at fs/sysfs/dir.c:536 sysfs_add_one+0xc1/0xf0()
  [    1.121437] sysfs: cannot create duplicate filename '/kernel/slab/:t-0000016'
  [    1.121831] Modules linked in:
  [    1.122138] Pid: 1, comm: swapper/0 Tainted: G        W    3.6.0-rc1+ #6
  [    1.122831] Call Trace:
  [    1.123074]  [<ffffffff81195ce1>] ? sysfs_add_one+0xc1/0xf0
  [    1.123833]  [<ffffffff8103adfa>] warn_slowpath_common+0x7a/0xb0
  [    1.124405]  [<ffffffff8103aed1>] warn_slowpath_fmt+0x41/0x50
  [    1.124832]  [<ffffffff81195ce1>] sysfs_add_one+0xc1/0xf0
  [    1.125337]  [<ffffffff81195eb3>] create_dir+0x73/0xd0
  [    1.125832]  [<ffffffff81196221>] sysfs_create_dir+0x81/0xe0
  [    1.126363]  [<ffffffff81273d3d>] kobject_add_internal+0x9d/0x210
  [    1.126832]  [<ffffffff812742a3>] kobject_init_and_add+0x63/0x90
  [    1.127406]  [<ffffffff81124c60>] sysfs_slab_add+0x80/0x210
  [    1.127832]  [<ffffffff81100175>] kmem_cache_create+0xa5/0x250
  [    1.128384]  [<ffffffff81cf24cd>] ? md_init+0x144/0x144
  [    1.128833]  [<ffffffff81cf25b6>] local_init+0xa4/0x11b
  [    1.129831]  [<ffffffff81cf24e1>] dm_init+0x14/0x45
  [    1.130305]  [<ffffffff810001ba>] do_one_initcall+0x3a/0x160
  [    1.130831]  [<ffffffff81cc2c90>] kernel_init+0x133/0x1b7
  [    1.131351]  [<ffffffff81cc25c4>] ? do_early_param+0x86/0x86
  [    1.131830]  [<ffffffff8171aff4>] kernel_thread_helper+0x4/0x10
  [    1.132392]  [<ffffffff81cc2b5d>] ? start_kernel+0x33f/0x33f
  [    1.132830]  [<ffffffff8171aff0>] ? gs_change+0xb/0xb
  [    1.133315] ---[ end trace 2703540871c8fab7 ]---
  [    1.133830] ------------[ cut here ]------------
  [    1.134274] WARNING: at lib/kobject.c:196 kobject_add_internal+0x1f5/0x210()
  [    1.134829] kobject_add_internal failed for :t-0000016 with -EEXIST, don't try to register things with the same name in the same directory.
  [    1.135829] Modules linked in:
  [    1.136135] Pid: 1, comm: swapper/0 Tainted: G        W    3.6.0-rc1+ #6
  [    1.136828] Call Trace:
  [    1.137071]  [<ffffffff81273e95>] ? kobject_add_internal+0x1f5/0x210
  [    1.137830]  [<ffffffff8103adfa>] warn_slowpath_common+0x7a/0xb0
  [    1.138402]  [<ffffffff8103aed1>] warn_slowpath_fmt+0x41/0x50
  [    1.138830]  [<ffffffff811955a3>] ? release_sysfs_dirent+0x73/0xf0
  [    1.139419]  [<ffffffff81273e95>] kobject_add_internal+0x1f5/0x210
  [    1.139830]  [<ffffffff812742a3>] kobject_init_and_add+0x63/0x90
  [    1.140429]  [<ffffffff81124c60>] sysfs_slab_add+0x80/0x210
  [    1.140830]  [<ffffffff81100175>] kmem_cache_create+0xa5/0x250
  [    1.141829]  [<ffffffff81cf24cd>] ? md_init+0x144/0x144
  [    1.142307]  [<ffffffff81cf25b6>] local_init+0xa4/0x11b
  [    1.142829]  [<ffffffff81cf24e1>] dm_init+0x14/0x45
  [    1.143307]  [<ffffffff810001ba>] do_one_initcall+0x3a/0x160
  [    1.143829]  [<ffffffff81cc2c90>] kernel_init+0x133/0x1b7
  [    1.144352]  [<ffffffff81cc25c4>] ? do_early_param+0x86/0x86
  [    1.144829]  [<ffffffff8171aff4>] kernel_thread_helper+0x4/0x10
  [    1.145405]  [<ffffffff81cc2b5d>] ? start_kernel+0x33f/0x33f
  [    1.145828]  [<ffffffff8171aff0>] ? gs_change+0xb/0xb
  [    1.146313] ---[ end trace 2703540871c8fab8 ]---

Conflicts:

	mm/slub.c

Signed-off-by: Pekka Enberg <penberg@kernel.org>
2012-09-05 12:07:44 +03:00
Christoph Lameter cce89f4f69 mm/sl[aou]b: Move kmem_cache refcounting to common code
Get rid of the refcount stuff in the allocators and do that part of
kmem_cache management in the common code.

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2012-09-05 12:00:37 +03:00
Christoph Lameter 8a13a4cc80 mm/sl[aou]b: Shrink __kmem_cache_create() parameter lists
Do the initial settings of the fields in common code. This will allow us
to push more processing into common code later and improve readability.

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2012-09-05 12:00:37 +03:00
Christoph Lameter 278b1bb131 mm/sl[aou]b: Move kmem_cache allocations into common code
Shift the allocations to common code. That way the allocation and
freeing of the kmem_cache structures is handled by common code.

Reviewed-by: Glauber Costa <glommer@parallels.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2012-09-05 12:00:36 +03:00
Christoph Lameter 96d17b7be0 mm/sl[aou]b: Move sysfs_slab_add to common
Simplify locking by moving the slab_add_sysfs after all locks have been
dropped. Eases the upcoming move to provide sysfs support for all
allocators.

Reviewed-by: Glauber Costa <glommer@parallels.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2012-09-05 12:00:36 +03:00
Christoph Lameter cbb79694d5 mm/sl[aou]b: Do slab aliasing call from common code
The slab aliasing logic causes some strange contortions in slub. So add
a call to deal with aliases to slab_common.c but disable it for other
slab allocators by providng stubs that fail to create aliases.

Full general support for aliases will require additional cleanup passes
and more standardization of fields in kmem_cache.

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2012-09-05 12:00:36 +03:00
Christoph Lameter db265eca77 mm/sl[aou]b: Move duping of slab name to slab_common.c
Duping of the slabname has to be done by each slab. Moving this code to
slab_common avoids duplicate implementations.

With this patch we have common string handling for all slab allocators.
Strings passed to kmem_cache_create() are copied internally. Subsystems
can create temporary strings to create slab caches.

Slabs allocated in early states of bootstrap will never be freed (and
those can never be freed since they are essential to slab allocator
operations).  During bootstrap we therefore do not have to worry about
duping names.

Reviewed-by: Glauber Costa <glommer@parallels.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2012-09-05 12:00:36 +03:00
Christoph Lameter 12c3667fb7 mm/sl[aou]b: Get rid of __kmem_cache_destroy
What is done there can be done in __kmem_cache_shutdown.

This affects RCU handling somewhat. On rcu free all slab allocators do
not refer to other management structures than the kmem_cache structure.
Therefore these other structures can be freed before the rcu deferred
free to the page allocator occurs.

Reviewed-by: Joonsoo Kim <js1304@gmail.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2012-09-05 12:00:36 +03:00
Christoph Lameter 8f4c765c22 mm/sl[aou]b: Move freeing of kmem_cache structure to common code
The freeing action is basically the same in all slab allocators.
Move to the common kmem_cache_destroy() function.

Reviewed-by: Glauber Costa <glommer@parallels.com>
Reviewed-by: Joonsoo Kim <js1304@gmail.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2012-09-05 12:00:36 +03:00
Christoph Lameter 9b030cb865 mm/sl[aou]b: Use "kmem_cache" name for slab cache with kmem_cache struct
Make all allocators use the "kmem_cache" slabname for the "kmem_cache"
structure.

Reviewed-by: Glauber Costa <glommer@parallels.com>
Reviewed-by: Joonsoo Kim <js1304@gmail.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2012-09-05 12:00:36 +03:00
Christoph Lameter 945cf2b619 mm/sl[aou]b: Extract a common function for kmem_cache_destroy
kmem_cache_destroy does basically the same in all allocators.

Extract common code which is easy since we already have common mutex
handling.

Reviewed-by: Glauber Costa <glommer@parallels.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2012-09-05 12:00:35 +03:00
Christoph Lameter 7c9adf5a54 mm/sl[aou]b: Move list_add() to slab_common.c
Move the code to append the new kmem_cache to the list of slab caches to
the kmem_cache_create code in the shared code.

This is possible now since the acquisition of the mutex was moved into
kmem_cache_create().

Acked-by: David Rientjes <rientjes@google.com>
Reviewed-by: Glauber Costa <glommer@parallels.com>
Reviewed-by: Joonsoo Kim <js1304@gmail.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2012-09-05 12:00:35 +03:00
Christoph Lameter 208c4358dc mm/slub: Use kmem_cache for the kmem_cache structure
Do not use kmalloc() but kmem_cache_alloc() for the allocation
of the kmem_cache structures in slub.

Reviewed-by: Glauber Costa <glommer@parallels.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2012-09-05 12:00:35 +03:00
Christoph Lameter 79576102af mm/slub: Add debugging to verify correct cache use on kmem_cache_free()
Add additional debugging to check that the objects is actually from the cache
the caller claims. Doing so currently trips up some other debugging code. It
takes a lot to infer from that what was happening.

Reviewed-by: Glauber Costa <glommer@parallels.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
[ penberg@kernel.org: Use pr_err() ]
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2012-09-05 12:00:35 +03:00
Joonsoo Kim e24fc410f5 slub: reduce failure of this_cpu_cmpxchg in put_cpu_partial() after unfreezing
In current implementation, after unfreezing, we doesn't touch oldpage,
so it remain 'NOT NULL'. When we call this_cpu_cmpxchg()
with this old oldpage, this_cpu_cmpxchg() is mostly be failed.

We can change value of oldpage to NULL after unfreezing,
because unfreeze_partial() ensure that all the cpu partial slabs is removed
from cpu partial list. In this time, we could expect that
this_cpu_cmpxchg is mostly succeed.

Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Joonsoo Kim <js1304@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2012-08-16 10:06:42 +03:00
Christoph Lameter 19c7ff9ecd slub: Take node lock during object free checks
Only applies to scenarios where debugging is on:

Validation of slabs can currently occur while debugging
information is updated from the fast paths of the allocator.
This results in various races where we get false reports about
slab metadata not being in order.

This patch makes the fast paths take the node lock so that
serialization with slab validation will occur. Causes additional
slowdown in debug scenarios.

Reported-by: Waiman Long <Waiman.Long@hp.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2012-08-16 09:45:04 +03:00
Glauber Costa d9b7f22623 slub: use free_page instead of put_page for freeing kmalloc allocation
When freeing objects, the slub allocator will most of the time free
empty pages by calling __free_pages(). But high-order kmalloc will be
diposed by means of put_page() instead. It makes no sense to call
put_page() in kernel pages that are provided by the object allocators,
so we shouldn't be doing this ourselves. Aside from the consistency
change, we don't change the flow too much. put_page()'s would call its
dtor function, which is __free_pages. We also already do all of the
Compound page tests ourselves, and the Mlock test we lose don't really
matter.

Signed-off-by: Glauber Costa <glommer@parallels.com>
Acked-by: Christoph Lameter <cl@linux.com>
CC: David Rientjes <rientjes@google.com>
CC: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2012-08-16 09:25:03 +03:00
Christoph Lameter 5091b74a95 mm: slub: optimise the SLUB fast path to avoid pfmemalloc checks
This patch removes the check for pfmemalloc from the alloc hotpath and
puts the logic after the election of a new per cpu slab.  For a pfmemalloc
page we do not use the fast path but force the use of the slow path which
is also used for the debug case.

This has the side-effect of weakening pfmemalloc processing in the
following way;

1. A process that is allocating for network swap calls __slab_alloc.
   pfmemalloc_match is true so the freelist is loaded and c->freelist is
   now pointing to a pfmemalloc page.

2. A process that is attempting normal allocations calls slab_alloc,
   finds the pfmemalloc page on the freelist and uses it because it did
   not check pfmemalloc_match()

The patch allows non-pfmemalloc allocations to use pfmemalloc pages with
the kmalloc slabs being the most vunerable caches on the grounds they
are most likely to have a mix of pfmemalloc and !pfmemalloc requests. A
later patch will still protect the system as processes will get throttled
if the pfmemalloc reserves get depleted but performance will not degrade
as smoothly.

[mgorman@suse.de: Expanded changelog]
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: David Miller <davem@davemloft.net>
Cc: Neil Brown <neilb@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Eric B Munson <emunson@mgebm.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-07-31 18:42:45 -07:00