This patch is a logical extension of the protection provided by
CONFIG_DEBUG_RODATA to LKMs. The protection is provided by
splitting module_core and module_init into three logical parts
each and setting appropriate page access permissions for each
individual section:
1. Code: RO+X
2. RO data: RO+NX
3. RW data: RW+NX
In order to achieve proper protection, layout_sections() has
been modified to align each of the three parts mentioned above
onto a page boundary. Next, the corresponding page access
permissions are set right before successful exit from
load_module(). Further, free_module() and sys_init_module() have
been modified to set module_core and module_init as RW+NX right
before calling module_free().
By default, the original section layout and access flags are
preserved. When compiled with CONFIG_DEBUG_SET_MODULE_RONX=y,
the patch will page-align each group of sections to ensure that
each page contains only one type of content and will enforce
RO/NX for each group of pages.
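As a rough illustration of the page-alignment step, the following
user-space sketch computes where each of the three groups would end
once every group is rounded up to a page boundary. The names
(EXAMPLE_PAGE_SIZE, struct region_layout, layout_region()) are
hypothetical and are not taken from the patch; a 4096-byte page is
assumed.

```c
#include <assert.h>
#include <stddef.h>

#define EXAMPLE_PAGE_SIZE 4096UL /* assumed page size for this sketch */

/* Round x up to the next multiple of the (power-of-two) page size. */
static unsigned long page_align(unsigned long x)
{
	return (x + EXAMPLE_PAGE_SIZE - 1) & ~(EXAMPLE_PAGE_SIZE - 1);
}

/* Hypothetical summary of one module region (core or init). */
struct region_layout {
	unsigned long text_size;  /* [0, text_size):          code,    RO+X  */
	unsigned long ro_size;    /* [text_size, ro_size):    RO data, RO+NX */
	unsigned long total_size; /* [ro_size, total_size):   RW data, RW+NX */
};

/* Lay out code, RO data and RW data so that no page mixes two groups. */
static struct region_layout layout_region(unsigned long code_bytes,
					  unsigned long rodata_bytes,
					  unsigned long rwdata_bytes)
{
	struct region_layout l;

	/* The rounding trick above only works for power-of-two sizes. */
	assert((EXAMPLE_PAGE_SIZE & (EXAMPLE_PAGE_SIZE - 1)) == 0);

	l.text_size = page_align(code_bytes);
	l.ro_size = page_align(l.text_size + rodata_bytes);
	l.total_size = page_align(l.ro_size + rwdata_bytes);
	return l;
}
```

With these assumptions, layout_region(5000, 100, 10) ends the code
group at 8192, the RO data group at 12288 and the RW data group at
16384, so each group can receive its own page permissions.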
-v1: Initial proof-of-concept patch.
-v2: The patch has been rewritten to reduce the number of #ifdefs
and to make it architecture-agnostic. Code formatting has also
been corrected.
-v3: Opportunistic RO/NX protection is now unconditional. Section
page-alignment is enabled when CONFIG_DEBUG_RODATA=y.
-v4: Removed most macros and improved coding style.
-v5: Changed page-alignment and RO/NX section size calculation
-v6: Fixed comments. Restricted RO/NX enforcement to x86 only
-v7: Introduced CONFIG_DEBUG_SET_MODULE_RONX, added
calls to set_all_modules_text_rw() and set_all_modules_text_ro()
in ftrace
-v8: updated for compatibility with linux 2.6.33-rc5
-v9: coding style fixes
-v10: more coding style fixes
-v11: minor adjustments for -tip
-v12: minor adjustments for v2.6.35-rc2-tip
-v13: minor adjustments for v2.6.37-rc1-tip
Signed-off-by: Siarhei Liakh <sliakh.lkml@gmail.com>
Signed-off-by: Xuxian Jiang <jiang@cs.ncsu.edu>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Reviewed-by: James Morris <jmorris@namei.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Dave Jones <davej@redhat.com>
Cc: Kees Cook <kees.cook@canonical.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <4CE2F914.9070106@free.fr>
[ minor cleanliness edits, -v14: build failure fix ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
With all the recent module loading cleanups, we've minimized the code
that sits under module_mutex, fixing various deadlocks and making it
possible to do most of the module loading in parallel.
However, that whole conversion totally missed the rather obscure code
that adds a new module to the list for BUG() handling. That code was
doubly obscure because (a) the code itself lives in lib/bug.c (for
dubious reasons) and (b) it gets called from the architecture-specific
"module_finalize()" rather than from generic code.
Calling it from arch-specific code makes no sense whatsoever to begin
with, and is now actively wrong since that code isn't protected by the
module loading lock any more.
So this commit moves the "module_bug_{finalize,cleanup}()" calls away
from the arch-specific code, and into the generic code - and in the
process protects it with the module_mutex so that the list operations
are now safe.
Future fixups:
- move the module list handling code into kernel/module.c where it
belongs.
- get rid of 'module_bug_list' and just use the regular list of modules
(called 'modules' - imagine that) that we already create and maintain
for other reasons.
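The locking idea can be sketched in user space as follows. This is
only an illustration, assuming a pthread mutex as a stand-in for
module_mutex and a simple singly linked list; all example_* names are
hypothetical and do not appear in the kernel.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a module that is tracked for BUG() handling. */
struct example_module {
	const char *name;
	struct example_module *next_bug;
};

/* Stand-in for module_mutex: every list update happens under it. */
static pthread_mutex_t example_module_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct example_module *example_bug_list;

/* Add a module to the list; called from generic code, under the lock. */
static void example_bug_finalize(struct example_module *mod)
{
	pthread_mutex_lock(&example_module_mutex);
	mod->next_bug = example_bug_list;
	example_bug_list = mod;
	pthread_mutex_unlock(&example_module_mutex);
}

/* Remove a module from the list, again under the same lock. */
static void example_bug_cleanup(struct example_module *mod)
{
	struct example_module **pp;

	pthread_mutex_lock(&example_module_mutex);
	for (pp = &example_bug_list; *pp; pp = &(*pp)->next_bug) {
		if (*pp == mod) {
			*pp = mod->next_bug;
			break;
		}
	}
	pthread_mutex_unlock(&example_module_mutex);
}

/* Count list entries (helper for checking the sketch). */
static int example_bug_list_len(void)
{
	struct example_module *m;
	int n = 0;

	pthread_mutex_lock(&example_module_mutex);
	for (m = example_bug_list; m; m = m->next_bug)
		n++;
	pthread_mutex_unlock(&example_module_mutex);
	return n;
}
```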
Reported-and-tested-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Adrian Bunk <bunk@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Base patch to implement 'jump labeling'. Based on a new 'asm goto' inline
assembly gcc mechanism, we can now branch to labels from an 'asm goto'
statement. This allows us to create a 'no-op' fastpath, which can subsequently
be patched with a jump to the slowpath code. This is useful for code which
might be rarely used, but which we'd like to be able to call, if needed.
Tracepoints are the current use case that these are being implemented for.
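A minimal sketch of the unpatched state, assuming a compiler that
supports 'asm goto' (gcc 4.5 or later, or a recent clang). The asm
template is deliberately empty, so no jump is emitted and execution
always falls through to the fastpath; the run-time patching that would
redirect execution to the label is not shown. The ex_* names are
hypothetical.

```c
#include <assert.h>

static int ex_slowpath_hits; /* counts how often the slow path ran */

/*
 * The empty asm goto statement tells the compiler that control may
 * leave through 'slow_path', but emits no instruction. Until something
 * patches a jump in at that location, the fast path is all that runs.
 */
static int ex_do_work(int x)
{
	__asm__ goto("" : : : : slow_path);
	return x + 1;		/* fast path: no test, no branch */

slow_path:
	ex_slowpath_hits++;	/* e.g. call an attached tracepoint probe */
	return x + 1;
}
```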
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jason Baron <jbaron@redhat.com>
LKML-Reference: <ee8b3595967989fdaf84e698dc7447d315ce972a.1284733808.git.jbaron@redhat.com>
[ cleaned up some formatting ]
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
These were placed in the header in ef665c1a06 to get the various
SYSFS/MODULE config combinations to compile.
That may have been necessary then, but it's not now. These functions
are all local to module.c.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
When adding a module that depends on another one, we used to create a
one-way list of "modules_which_use_me", so that module unloading could
see who needs a module.
It's actually quite simple to make that list go both ways: so that we
not only can see "who uses me", but also see a list of modules that are
"used by me".
In fact, we always wanted that list in "module_unload_free()": when we
unload a module, we want to also release all the other modules that are
used by that module. But because we didn't have that list, we used to
first iterate over all modules, and then iterate over the "who uses me"
list of each of those modules.
By making the list two-way, we simplify module_unload_free(), and it
allows for some trivial fixes later too.
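A user-space sketch of the two-way idea, assuming plain singly linked
lists instead of the kernel's list_head and hypothetical ex_* names.
Each "a uses b" edge is linked into both a's "used by me" list and b's
"who uses me" list, so unloading only has to walk the module's own
list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct ex_module;

/* One edge "source uses target", linked into both modules' lists. */
struct ex_module_use {
	struct ex_module *source;
	struct ex_module *target;
	struct ex_module_use *next_source; /* next in target->source_list */
	struct ex_module_use *next_target; /* next in source->target_list */
};

struct ex_module {
	const char *name;
	int refcount;
	struct ex_module_use *source_list; /* "who uses me" */
	struct ex_module_use *target_list; /* "used by me" */
};

/* Record that module a uses module b; returns 0 on success. */
static int ex_add_module_usage(struct ex_module *a, struct ex_module *b)
{
	struct ex_module_use *use = malloc(sizeof(*use));

	if (!use)
		return -1;
	use->source = a;
	use->target = b;
	use->next_source = b->source_list;
	b->source_list = use;
	use->next_target = a->target_list;
	a->target_list = use;
	b->refcount++;
	return 0;
}

/* Release everything mod uses by walking only mod's own target_list. */
static void ex_module_unload_free(struct ex_module *mod)
{
	struct ex_module_use *use = mod->target_list;

	while (use) {
		struct ex_module_use *next = use->next_target;
		struct ex_module_use **pp = &use->target->source_list;

		/* Unlink this edge from the target's "who uses me" list. */
		while (*pp != use)
			pp = &(*pp)->next_source;
		*pp = use->next_source;

		use->target->refcount--;
		free(use);
		use = next;
	}
	mod->target_list = NULL;
}
```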
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (cleaned & rebased)
Module refcounting is implemented with a per-cpu counter for speed.
However there is a race when tallying the counter where a reference may
be taken by one CPU and released by another. Reference count summation
may then see the decrement without having seen the previous increment,
leading to a lower than expected count. A module which never has its
actual reference drop below 1 may return a reference count of 0 due to
this race.
Module removal generally runs under stop_machine, which prevents this
race causing bugs due to removal of in-use modules. However there are
other real bugs in module.c code and driver code (module_refcount is
exported) where the callers do not run under stop_machine.
Fix this by maintaining running per-cpu counters for the number of
module refcount increments and the number of refcount decrements. The
increments are tallied after the decrements, so any decrement seen will
always have its corresponding increment counted. The final refcount is
the difference of the total increments and decrements, preventing a
low-refcount from being returned.
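The counting scheme can be sketched as below. This is a
single-threaded user-space illustration with hypothetical ex_* names
and a fixed number of simulated CPUs; a real concurrent implementation
would additionally need memory barriers to enforce the ordering
between the two summation loops.

```c
#include <assert.h>

#define EX_NR_CPUS 4 /* number of simulated CPUs, chosen arbitrarily */

/* Per-CPU running totals; neither counter ever decreases. */
struct ex_module_ref {
	unsigned long incs;
	unsigned long decs;
};

static struct ex_module_ref ex_refs[EX_NR_CPUS];

/* Take a reference on the given (simulated) CPU. */
static void ex_module_get(int cpu)
{
	assert(cpu >= 0 && cpu < EX_NR_CPUS);
	ex_refs[cpu].incs++;
}

/* Drop a reference, possibly on a different CPU than the get. */
static void ex_module_put(int cpu)
{
	assert(cpu >= 0 && cpu < EX_NR_CPUS);
	ex_refs[cpu].decs++;
}

/*
 * Sum the decrements first, then the increments. Any decrement that
 * was counted had its increment happen earlier, so that increment is
 * also seen by the second loop and the result cannot come out too low.
 */
static unsigned long ex_module_refcount(void)
{
	unsigned long incs = 0, decs = 0;
	int cpu;

	for (cpu = 0; cpu < EX_NR_CPUS; cpu++)
		decs += ex_refs[cpu].decs;
	for (cpu = 0; cpu < EX_NR_CPUS; cpu++)
		incs += ex_refs[cpu].incs;
	return incs - decs;
}
```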
Signed-off-by: Nick Piggin <npiggin@suse.de>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove the @refcnt argument, because it has side-effects, and arguments with
side-effects are not skipped by the jump over disabled instrumentation and are
executed even when the tracepoint is disabled.
This was also causing a GPF as found by Randy Dunlap:
Subject: 2.6.33 GP fault only when built with tracing
LKML-Reference: <4BA2B69D.3000309@oracle.com>
Note, the current 2.6.34-rc has a fix for the actual cause of the GPF,
but this fixes one of its triggers.
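The side-effect problem can be shown with a small user-space sketch.
The ex_* names are hypothetical, and an ordinary function with an
early return stands in for the tracepoint: the argument expression is
evaluated by the caller before the enabled check is reached.

```c
#include <assert.h>

static int ex_tracepoint_enabled; /* 0 means the tracepoint is disabled */
static int ex_refcount_reads;     /* how often the refcount was computed */
static int ex_events_recorded;    /* how often a trace event was emitted */

/* Stand-in for an expensive or unsafe read with a visible side effect. */
static int ex_read_refcount(void)
{
	ex_refcount_reads++;
	return 1;
}

/* Variant with an argument: the caller evaluates it unconditionally. */
static void ex_trace_module_get(int refcnt)
{
	if (!ex_tracepoint_enabled)
		return;
	(void)refcnt;
	ex_events_recorded++;
}

static void ex_get_with_arg(void)
{
	ex_trace_module_get(ex_read_refcount());
}

/* Variant without the argument: nothing extra runs when disabled. */
static void ex_trace_module_get_noarg(void)
{
	if (!ex_tracepoint_enabled)
		return;
	ex_events_recorded++;
}

static void ex_get_without_arg(void)
{
	ex_trace_module_get_noarg();
}
```

With the tracepoint disabled, ex_get_with_arg() still performs the
refcount read, while ex_get_without_arg() does no extra work.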
Tested-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <4BA97FA7.6040406@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Fix build for CONFIG_MODULES not enabled by providing a stub
for is_module_percpu_address().
kernel/lockdep.c:605: error: implicit declaration of function 'is_module_percpu_address'
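The usual shape of such a stub is sketched below. It assumes
CONFIG_MODULES is left undefined when the snippet is compiled; the
exact placement and signature in the kernel header may differ.

```c
#include <assert.h>
#include <stdbool.h>

#ifdef CONFIG_MODULES
/* Real implementation is provided elsewhere when modules are enabled. */
bool is_module_percpu_address(unsigned long addr);
#else
/* Without module support no address can belong to a module. */
static inline bool is_module_percpu_address(unsigned long addr)
{
	(void)addr;
	return false;
}
#endif
```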
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
lockdep has custom code to check whether a pointer belongs to the static
percpu area, and that code is somewhat broken. Implement proper
is_kernel/module_percpu_address() and replace the custom code.
On UP, percpu variables are regular static variables and can't be
distinguished from them. Always return %false on UP.
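A user-space sketch of the SMP check, assuming each simulated CPU owns
one copy of a fixed-size static percpu area. The ex_* names and sizes
are hypothetical; the test simply asks whether the address falls
inside any CPU's copy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EX_NR_CPUS 2             /* simulated CPU count */
#define EX_STATIC_PERCPU_SIZE 64 /* assumed size of the static area */

/* One copy of the static percpu area per simulated CPU. */
static char ex_percpu_area[EX_NR_CPUS][EX_STATIC_PERCPU_SIZE];

/* An ordinary static variable that lies outside every percpu copy. */
static int ex_ordinary_variable;

/* Return true if addr lies inside any CPU's static percpu copy. */
static bool ex_is_percpu_address(const void *addr)
{
	uintptr_t p = (uintptr_t)addr;
	int cpu;

	for (cpu = 0; cpu < EX_NR_CPUS; cpu++) {
		uintptr_t start = (uintptr_t)ex_percpu_area[cpu];

		if (p >= start && p < start + EX_STATIC_PERCPU_SIZE)
			return true;
	}
	return false;
}
```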
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Ingo Molnar <mingo@redhat.com>
Better encapsulate module static percpu area handling so that code
outside of CONFIG_SMP ifdef doesn't deal with mod->percpu directly
and add mod->percpu_size and record percpu_size in it. Both percpu
fields are compiled out on UP. While at it, mark mod->percpu w/
__percpu.
This is to prepare for is_module_percpu_address().
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Extern declarations in sysctl.c should be moved to their own header files,
which are then included in the relevant .c files.
Move modprobe_path extern declaration to linux/kmod.h
Move modules_disabled extern declaration to linux/module.h
Signed-off-by: Dave Young <hidave.darkstar@gmail.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ringbuffer*.c are the last users of local.h.
Remove the include from module.h and add it to ringbuffer files.
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Use cpu ops to deal with the per cpu data instead of a local_t. Reduces memory
requirements, cache footprint and decreases cycle counts.
The this_cpu_xx operations are also used for !SMP mode. Otherwise we could
not drop the use of __module_ref_addr() which would make per cpu data handling
complicated. this_cpu_xx operations have their own fallback for !SMP.
V8-V9:
- Leave include asm/module.h since ringbuffer.c depends on it. Nothing else
does though. Another patch will deal with that.
- Remove spurious free.
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Tejun Heo <tj@kernel.org>
The next commit will require the use of MODULE_SYMBOL_PREFIX in
.tmp_exports-asm.S. Currently it is mixed in with C structure
definitions in "asm/module.h". Move the definition of this arch option
into Kconfig, so it can be easily accessed by any code.
This also lets modpost.c use the same definition. Previously modpost
relied on a hardcoded list of architectures in mk_elfconfig.c.
A build test for blackfin, one of the two MODULE_SYMBOL_PREFIX archs,
showed the generated code was unchanged (vmlinux was identical save
for build ids and an apparently randomized suffix on a single "__key"
symbol in the kallsyms data).
Signed-off-by: Alan Jenkins <alan-jenkins@tuffmail.co.uk>
Acked-by: Mike Frysinger <vapier@gentoo.org> (blackfin)
CC: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Also remove all parts of the string table (referenced by the symbol
table) that are not needed for kallsyms use (i.e. which were only
referenced by symbols discarded by the previous patch, or not
referenced at all for whatever reason).
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Discard all symbols not interesting for kallsyms use: absolute,
section, and in the common case (!KALLSYMS_ALL) data ones.
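The filter can be sketched as follows, using simplified stand-ins for
the ELF symbol and section header fields. The struct layouts and ex_*
names are hypothetical; only the flag values match the ELF
definitions of SHF_ALLOC and SHF_EXECINSTR. Absolute symbols carry a
section index outside the section table, and section symbols have no
name, so both are rejected by the first check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EX_SHN_UNDEF     0   /* undefined section index */
#define EX_SHF_ALLOC     0x2 /* section occupies memory at run time */
#define EX_SHF_EXECINSTR 0x4 /* section contains executable code */

/* Simplified stand-ins for the ELF section header and symbol entry. */
struct ex_section {
	unsigned int flags;
};

struct ex_symbol {
	unsigned int name_off; /* offset into the string table, 0 = unnamed */
	unsigned int shndx;    /* index of the section the symbol lives in */
};

/* Decide whether a symbol is worth keeping for kallsyms. */
static bool ex_keep_symbol(const struct ex_symbol *sym,
			   const struct ex_section *secs,
			   unsigned int nsecs, bool kallsyms_all)
{
	/* Drop undefined, absolute/special and unnamed (section) symbols. */
	if (sym->shndx == EX_SHN_UNDEF || sym->shndx >= nsecs ||
	    sym->name_off == 0)
		return false;

	/* Drop symbols in sections that are not loaded at all. */
	if (!(secs[sym->shndx].flags & EX_SHF_ALLOC))
		return false;

	/* In the common case keep only code symbols, not data ones. */
	if (!kallsyms_all && !(secs[sym->shndx].flags & EX_SHF_EXECINSTR))
		return false;

	return true;
}
```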
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>