Commit Graph

83 Commits

Author SHA1 Message Date
Christoph Lameter
2b71244285 percpu: Generic this_cpu_cmpxchg() and this_cpu_xchg support
Generic code to provide new per cpu atomic features

	this_cpu_cmpxchg
	this_cpu_xchg

Fallback occurs to functions using interrupts disable/enable
to ensure correct per cpu atomicity.

Fallback to regular cmpxchg and xchg is not possible since per cpu atomic
semantics include the guarantee that the current cpus per cpu data is
accessed atomically. Use of regular cmpxchg and xchg requires the
determination of the address of the per cpu data before regular cmpxchg
or xchg which therefore cannot be atomically included in an xchg or
cmpxchg without segment override.

tj: - Relocated new ops to conform better to the general organization.
    - This patch contains a trivial comment fix.

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2010-12-18 15:54:04 +01:00
Tejun Heo
403047754c percpu,x86: relocate this_cpu_add_return() and friends
- include/linux/percpu.h: this_cpu_add_return() and friends were
  located next to __this_cpu_add_return().  However, the overall
  organization is to first group by preemption safeness.  Relocate
  this_cpu_add_return() and friends to preemption-safe area.

- arch/x86/include/asm/percpu.h: Relocate percpu_add_return_op() after
  other more basic operations.  Relocate [__]this_cpu_add_return_8()
  so that they're first grouped by preemption safeness.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux.com>
2010-12-17 16:13:22 +01:00
Christoph Lameter
a663ffff1d percpu: Generic support for this_cpu_add, sub, dec, inc_return
Introduce generic support for this_cpu_add_return etc.

The fallback is to realize these operations with simpler __this_cpu_ops.

tj: - Reformatted __cpu_size_call_return2() to make it more consistent
      with its neighbors.
    - Dropped unnecessary temp variable ret__ from
      __this_cpu_generic_add_return().

Reviewed-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2010-12-17 15:15:28 +01:00
Linus Torvalds
0fc0531e0a Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  percpu: update comments to reflect that percpu allocations are always zero-filled
  percpu: Optimize __get_cpu_var()
  x86, percpu: Optimize this_cpu_ptr
  percpu: clear memory allocated with the km allocator
  percpu: fix build breakage on s390 and cleanup build configuration tests
  percpu: use percpu allocator on UP too
  percpu: reduce PCPU_MIN_UNIT_SIZE to 32k
  vmalloc: pcpu_get/free_vm_areas() aren't needed on UP

Fixed up trivial conflicts in include/linux/percpu.h
2010-10-22 17:31:36 -07:00
Peter Zijlstra
8b8e2ec1ee percpu: Add {get,put}_cpu_ptr
These are similar to {get,put}_cpu_var() except for dynamically
allocated per-cpu memory.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Tejun Heo <tj@kernel.org>
LKML-Reference: <20100917093009.252867712@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-21 13:55:43 +02:00
Tejun Heo
bbddff0545 percpu: use percpu allocator on UP too
On UP, percpu allocations were redirected to kmalloc.  This has the
following problems.

* For certain amount of allocations (determined by
  PERCPU_DYNAMIC_EARLY_SLOTS and PERCPU_DYNAMIC_EARLY_SIZE), percpu
  allocator can be used before the usual kernel memory allocator is
  brought online.  On SMP, this is used to initialize the kernel
  memory allocator.

* percpu allocator honors alignment upto PAGE_SIZE but kmalloc()
  doesn't.  For example, workqueue makes use of larger alignments for
  cpu_workqueues.

Currently, users of percpu allocators need to handle UP differently,
which is somewhat fragile and ugly.  Other than small amount of
memory, there isn't much to lose by enabling percpu allocator on UP.
It can simply use kernel memory based chunk allocation which was added
for SMP archs w/o MMUs.

This patch removes mm/percpu_up.c, builds mm/percpu.c on UP too and
makes UP build use percpu-km.  As percpu addresses and kernel
addresses are always identity mapped and static percpu variables don't
need any special treatment, nothing is arch dependent and mm/percpu.c
implements generic setup_per_cpu_areas() for UP.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
2010-09-08 11:11:23 +02:00
Tejun Heo
6abad5acac percpu: reduce PCPU_MIN_UNIT_SIZE to 32k
In preparation of enabling percpu allocator for UP, reduce
PCPU_MIN_UNIT_SIZE to 32k.  On UP, the first chunk doesn't have to
include static percpu variables and chunk size can be smaller which is
important as UP percpu allocator will use contiguous kernel memory to
populate chunks.

PCPU_MIN_UNIT_SIZE also determines the maximum supported allocation
size but 32k should still be enough.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Christoph Lameter <cl@linux.com>
2010-09-08 11:11:12 +02:00
Namhyung Kim
18cb2aef91 percpu: handle __percpu notations in UP accessors
UP accessors didn't take care of __percpu notations leading to a lot
of spurious sparse warnings on UP configurations.  Fix it.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2010-08-07 14:20:53 +02:00
Tejun Heo
099a19d91c percpu: allow limited allocation before slab is online
This patch updates percpu allocator such that it can serve limited
amount of allocation before slab comes online.  This is primarily to
allow slab to depend on working percpu allocator.

Two parameters, PERCPU_DYNAMIC_EARLY_SIZE and SLOTS, determine how
much memory space and allocation map slots are reserved.  If this
reserved area is exhausted, WARN_ON_ONCE() will trigger and allocation
will fail till slab comes online.

The following changes are made to implement early alloc.

* pcpu_mem_alloc() now checks slab_is_available()

* Chunks are allocated using pcpu_mem_alloc()

* Init paths make sure ai->dyn_size is at least as large as
  PERCPU_DYNAMIC_EARLY_SIZE.

* Initial alloc maps are allocated in __initdata and copied to
  kmalloc'd areas once slab is online.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux-foundation.org>
2010-06-27 18:50:00 +02:00
Tejun Heo
4ba6ce250e percpu: make @dyn_size always mean min dyn_size in first chunk init functions
In pcpu_build_alloc_info() and pcpu_embed_first_chunk(), @dyn_size was
ssize_t, -1 meant auto-size, 0 forced 0 and positive meant minimum
size.  There's no use case for forcing 0 and the upcoming early alloc
support always requires non-zero dynamic size.  Make @dyn_size always
mean minimum dyn_size.

While at it, make pcpu_build_alloc_info() static which doesn't have
any external caller as suggested by David Rientjes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: David Rientjes <rientjes@google.com>
2010-06-27 18:49:59 +02:00
Linus Torvalds
b66696e3c0 Merge branch 'slabh' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc
* 'slabh' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc:
  eeepc-wmi: include slab.h
  staging/otus: include slab.h from usbdrv.h
  percpu: don't implicitly include slab.h from percpu.h
  kmemcheck: Fix build errors due to missing slab.h
  include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
  iwlwifi: don't include iwl-dev.h from iwl-devtrace.h
  x86: don't include slab.h from arch/x86/include/asm/pgtable_32.h

Fix up trivial conflicts in include/linux/percpu.h due to
is_kernel_percpu_address() having been introduced since the slab.h
cleanup with the percpu_up.c splitup.
2010-04-05 09:39:11 -07:00
Tejun Heo
de380b55f9 percpu: don't implicitly include slab.h from percpu.h
percpu.h has always been including slab.h to get k[mz]alloc/free() for
UP inline implementation.  percpu.h being used by very low level
headers including module.h and sched.h, this meant that a lot files
unintentionally got slab.h inclusion.

Lee Schermerhorn was trying to make topology.h use percpu.h and got
bitten by this implicit inclusion.  The right thing to do is break
this ultimately unnecessary dependency.  The previous patch added
explicit inclusion of either gfp.h or slab.h to the source files using
them.  This patch updates percpu.h such that slab.h is no longer
included from percpu.h.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-30 22:02:32 +09:00
Tejun Heo
10fad5e46f percpu, module: implement and use is_kernel/module_percpu_address()
lockdep has custom code to check whether a pointer belongs to static
percpu area which is somewhat broken.  Implement proper
is_kernel/module_percpu_address() and replace the custom code.

On UP, percpu variables are regular static variables and can't be
distinguished from them.  Always return %false on UP.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Ingo Molnar <mingo@redhat.com>
2010-03-29 23:07:12 +09:00
Tejun Heo
32032df6c2 Merge branch 'master' into percpu
Conflicts:
	arch/powerpc/platforms/pseries/hvCall.S
	include/linux/percpu.h
2010-01-05 09:17:33 +09:00
Tejun Heo
50de1a8ef1 Merge branch 'for-linus' into for-next
Conflicts:
	mm/percpu.c
2009-12-08 10:02:12 +09:00
Tejun Heo
ee0a6efc18 percpu: add missing per_cpu_ptr_to_phys() definition for UP
Commit 3b034b0d08 implemented
per_cpu_ptr_to_phys() but forgot to add UP definition.  Add UP
definition which is simple wrapper around __pa().

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
2009-12-02 08:36:58 +09:00
Vivek Goyal
3b034b0d08 percpu: Fix kdump failure if booted with percpu_alloc=page
o kdump functionality reserves a per cpu area at boot time and exports the
  physical address of that area to user space through sys interface. This
  area stores some dump related information like cpu register states etc
  at the time of crash.

o We were assuming that per cpu area always come from linearly mapped meory
  region and using __pa() to determine physical address.
  With percpu_alloc=page, per cpu area can come from vmalloc region also and
  __pa() breaks.

o This patch implments a new function to convert per cpu address to
  physical address.

Before the patch, crash_notes addresses looked as follows.

cpu0 60fffff49800
cpu1 60fffff60800
cpu2 60fffff77800

These are bogus phsyical addresses.

After the patch, address are following.

cpu0 13eb44000
cpu1 13eb43000
cpu2 13eb42000
cpu3 13eb41000

These look fine. I got 4G of memory and /proc/iomem tell me following.

100000000-13fffffff : System RAM

tj: * added missing asm/io.h include reported by Stephen Rothwell
    * repositioned per_cpu_ptr_phys() in percpu.c and added comment.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
2009-11-25 21:49:22 +09:00
Tejun Heo
545695fb41 percpu: make accessors check for percpu pointer in sparse
The previous patch made sparse warn about percpu variables being used
directly without going through percpu accessors.  This patch
implements the other half - checking whether non percpu variable is
passed into percpu accessors.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Al Viro <viro@zeniv.linux.org.uk>
2009-10-29 22:34:15 +09:00
Rusty Russell
e0fdb0e050 percpu: add __percpu for sparse.
We have to make __kernel "__attribute__((address_space(0)))" so we can
cast to it.

tj: * put_cpu_var() update.

    * Annotations added to dynamic allocator interface.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-10-29 22:34:15 +09:00
Tejun Heo
f7b64fe806 percpu: make access macros universal
Now that per_cpu__ prefix is gone, there's no distinction between
static and dynamic percpu variables.  Make get_cpu_var() take dynamic
percpu variables and ensure that all macros have parentheses around
the parameter evaluation and evaluate the variable parameter only once
such that any expression which evaluates to percpu address can be used
safely.

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-10-29 22:34:15 +09:00
Rusty Russell
dd17c8f729 percpu: remove per_cpu__ prefix.
Now that the return from alloc_percpu is compatible with the address
of per-cpu vars, it makes sense to hand around the address of per-cpu
variables.  To make this sane, we remove the per_cpu__ prefix we used
created to stop people accidentally using these vars directly.

Now we have sparse, we can use that (next patch).

tj: * Updated to convert stuff which were missed by or added after the
      original patch.

    * Kill per_cpu_var() macro.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
2009-10-29 22:34:15 +09:00
Tejun Heo
0f5e4816db percpu: remove some sparse warnings
Make the following changes to remove some sparse warnings.

* Make DEFINE_PER_CPU_SECTION() declare __pcpu_unique_* before
  defining it.

* Annotate pcpu_extend_area_map() that it is entered with pcpu_lock
  held, releases it and then reacquires it.

* Make percpu related macros use unique nested variable names.

* While at it, add pcpu prefix to __size_call[_return]() macros as
  to-be-implemented sparse annotations will add percpu specific stuff
  to these macros.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
2009-10-29 22:34:12 +09:00
Tejun Heo
64ef291f46 percpu: make alloc_percpu() handle array types
alloc_percpu() couldn't handle array types like "int [100]" due to the
way return type was casted.  Fix it by using typeof() instead.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
2009-10-29 22:34:12 +09:00
Christoph Lameter
7340a0b152 this_cpu: Introduce this_cpu_ptr() and generic this_cpu_* operations
This patch introduces two things: First this_cpu_ptr and then per cpu
atomic operations.

this_cpu_ptr
------------

A common operation when dealing with cpu data is to get the instance of the
cpu data associated with the currently executing processor. This can be
optimized by

this_cpu_ptr(xx) = per_cpu_ptr(xx, smp_processor_id).

The problem with per_cpu_ptr(x, smp_processor_id) is that it requires
an array lookup to find the offset for the cpu. Processors typically
have the offset for the current cpu area in some kind of (arch dependent)
efficiently accessible register or memory location.

We can use that instead of doing the array lookup to speed up the
determination of the address of the percpu variable. This is particularly
significant because these lookups occur in performance critical paths
of the core kernel. this_cpu_ptr() can avoid memory accesses and

this_cpu_ptr comes in two flavors. The preemption context matters since we
are referring the the currently executing processor. In many cases we must
insure that the processor does not change while a code segment is executed.

__this_cpu_ptr 	-> Do not check for preemption context
this_cpu_ptr	-> Check preemption context

The parameter to these operations is a per cpu pointer. This can be the
address of a statically defined per cpu variable (&per_cpu_var(xxx)) or
the address of a per cpu variable allocated with the per cpu allocator.

per cpu atomic operations: this_cpu_*(var, val)
-----------------------------------------------
this_cpu_* operations (like this_cpu_add(struct->y, value) operate on
abitrary scalars that are members of structures allocated with the new
per cpu allocator. They can also operate on static per_cpu variables
if they are passed to per_cpu_var() (See patch to use this_cpu_*
operations for vm statistics).

These operations are guaranteed to be atomic vs preemption when modifying
the scalar. The calculation of the per cpu offset is also guaranteed to
be atomic at the same time. This means that a this_cpu_* operation can be
safely used to modify a per cpu variable in a context where interrupts are
enabled and preemption is allowed. Many architectures can perform such
a per cpu atomic operation with a single instruction.

Note that the atomicity here is different from regular atomic operations.
Atomicity is only guaranteed for data accessed from the currently executing
processor. Modifications from other processors are still possible. There
must be other guarantees that the per cpu data is not modified from another
processor when using these instruction. The per cpu atomicity is created
by the fact that the processor either executes and instruction or not.
Embedded in the instruction is the relocation of the per cpu address to
the are reserved for the current processor and the RMW action. Therefore
interrupts or preemption cannot occur in the mids of this processing.

Generic fallback functions are used if an arch does not define optimized
this_cpu operations. The functions come also come in the two flavors used
for this_cpu_ptr().

The firstparameter is a scalar that is a member of a structure allocated
through allocpercpu or a per cpu variable (use per_cpu_var(xxx)). The
operations are similar to what percpu_add() and friends do.

this_cpu_read(scalar)
this_cpu_write(scalar, value)
this_cpu_add(scale, value)
this_cpu_sub(scalar, value)
this_cpu_inc(scalar)
this_cpu_dec(scalar)
this_cpu_and(scalar, value)
this_cpu_or(scalar, value)
this_cpu_xor(scalar, value)

Arch code can override the generic functions and provide optimized atomic
per cpu operations. These atomic operations must provide both the relocation
(x86 does it through a segment override) and the operation on the data in a
single instruction. Otherwise preempt needs to be disabled and there is no
gain from providing arch implementations.

A third variant is provided prefixed by irqsafe_. These variants are safe
against hardware interrupts on the *same* processor (all per cpu atomic
primitives are *always* *only* providing safety for code running on the
*same* processor!). The increment needs to be implemented by the hardware
in such a way that it is a single RMW instruction that is either processed
before or after an interrupt.

cc: David Howells <dhowells@redhat.com>
cc: Ingo Molnar <mingo@elte.hu>
cc: Rusty Russell <rusty@rustcorp.com.au>
cc: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2009-10-03 19:48:22 +09:00
Tejun Heo
23fb064bb9 percpu: kill legacy percpu allocator
With ia64 converted, there's no arch left which still uses legacy
percpu allocator.  Kill it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Delightedly-acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Christoph Lameter <cl@linux-foundation.org>
2009-10-02 13:29:29 +09:00