* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
module: remove CONFIG_KMOD in comment after #endif
remove CONFIG_KMOD from fs
remove CONFIG_KMOD from drivers
Manually fix conflict due to include cleanups in drivers/md/md.c
de_thread() checks if the old leader was the ->child_reaper, this is not
possible any longer. With the previous patch ->group_leader itself will
change ->child_reaper on exit.
Henceforth find_new_reaper() is the only function (apart from
initialization) which plays with ->child_reaper.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Acked-by: Pavel Emelyanov <xemul@openvz.org>
Acked-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
With MAX_ARG_STRINGS set to 0x7FFFFFFF, and being passed to 'count()' and
compat_count(), it would appear that the current max bounds check of
fs/exec.c:394:
if(++i > max)
return -E2BIG;
would never trigger. Since 'i' is of type int, so values would wrap and the
function would continue looping.
Simple fix seems to be chaning ++i to i++ and checking for '>='.
Signed-off-by: Jason Baron <jbaron@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: "Ollie Wild" <aaw@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Just always compile the code when the kernel is modular.
Convert load_nls to use try_then_request_module to tidy
up the code.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
There's a race between mm->owner assignment and swapoff, more easily
seen when task slab poisoning is turned on. The condition occurs when
try_to_unuse() runs in parallel with an exiting task. A similar race
can occur with callers of get_task_mm(), such as /proc/<pid>/<mmstats>
or ptrace or page migration.
CPU0 CPU1
try_to_unuse
looks at mm = task0->mm
increments mm->mm_users
task 0 exits
mm->owner needs to be updated, but no
new owner is found (mm_users > 1, but
no other task has task->mm = task0->mm)
mm_update_next_owner() leaves
mmput(mm) decrements mm->mm_users
task0 freed
dereferencing mm->owner fails
The fix is to notify the subsystem via mm_owner_changed callback(),
if no new owner is found, by specifying the new task as NULL.
Jiri Slaby:
mm->owner was set to NULL prior to calling cgroup_mm_owner_callbacks(), but
must be set after that, so as not to pass NULL as old owner causing oops.
Daisuke Nishimura:
mm_update_next_owner() may set mm->owner to NULL, but mem_cgroup_from_task()
and its callers need to take account of this situation to avoid oops.
Hugh Dickins:
Lockdep warning and hang below exec_mmap() when testing these patches.
exit_mm() up_reads mmap_sem before calling mm_update_next_owner(),
so exec_mmap() now needs to do the same. And with that repositioning,
there's now no point in mm_need_new_owner() allowing for NULL mm.
Reported-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix compilation errors on avr32 and without CONFIG_SWAP, introduced by
ba92a43dba ("exec: remove some includes")
In file included from include/asm/tlb.h:24,
from fs/exec.c:55:
include/asm-generic/tlb.h: In function 'tlb_flush_mmu':
include/asm-generic/tlb.h:76: error: implicit declaration of function 'release_pages'
include/asm-generic/tlb.h: In function 'tlb_remove_page':
include/asm-generic/tlb.h:105: error: implicit declaration of function 'page_cache_release'
make[1]: *** [fs/exec.o] Error 1
This straightforward part-revert is nobody's favourite patch to address
the underlying tlb.h needs swap.h needs pagemap.h (but sparc won't like
that) mess; but appropriate to fix the build now before any overhaul.
Reported-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp>
Reported-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Tested-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On Mon, May 19, 2008 at 12:01:49AM +0200, Marcin Slusarz wrote:
> open_exec is needlessly indented, calls ERR_PTR with 0 argument
> (which is not valid errno) and jumps into middle of function
> just to return value.
> So clean it up a bit.
Still looks rather messy. See below for a better version.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
This moves all the ptrace hooks related to exec into tracehook.h inlines.
This also lifts the calls for tracing out of the binfmt load_binary hooks
into search_binary_handler() after it calls into the binfmt module. This
change has no effect, since all the binfmt modules' load_binary functions
did the call at the end on success, and now search_binary_handler() does
it immediately after return if successful. We consolidate the repeated
code, and binfmt modules no longer need to import ptrace_notify().
Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I don't understand why the multi-thread coredump implies the core_uses_pid
behaviour, but we shouldn't use mm->mm_users for that. This counter can
be incremented by get_task_mm(). Use the valued returned by
coredump_wait() instead.
Also, remove the "const char *pattern" argument, format_corename() can use
core_pattern directly.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Roland McGrath <roland@redhat.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Now that we have core_state->dumper list we can use it to wake up the
sub-threads waiting for the coredump completion.
This uglifies the code and .text grows by 47 bytes, but otoh mm_struct
lessens by sizeof(struct completion). Also, with this change we can
decouple exit_mm() from the coredumping code.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
binfmt->core_dump() has to iterate over the all threads in system in order
to find the coredumping threads and construct the list using the
GFP_ATOMIC allocations.
With this patch each thread allocates the list node on exit_mm()'s stack and
adds itself to the list.
This allows us to do further changes:
- simplify ->core_dump()
- change exit_mm() to clear ->mm first, then wait for ->core_done.
this makes the coredumping process visible to oom_kill
- kill mm->core_done
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Move the "struct core_state core_state" from coredump_wait() to
do_coredump(), this makes mm->core_state visible to binfmt->core_dump().
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change zap_process() to return int instead of incrementing
mm->core_state->nr_threads directly. Change zap_threads() to set
mm->core_state only on success.
This patch restores the original size of .text, and more importantly now
->nr_threads is used in two places only.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Move mm->core_waiters into "struct core_state" allocated on stack. This
shrinks mm_struct a little bit and allows further changes.
This patch mostly does s/core_waiters/core_state. The only essential
change is that coredump_wait() must clear mm->core_state before return.
The coredump_wait()'s path is uglified and .text grows by 30 bytes, this
is fixed by the next patch.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm->core_startup_done points to "struct completion startup_done" allocated
on the coredump_wait()'s stack. Introduce the new structure, core_state,
which holds this "struct completion". This way we can add more info
visible to the threads participating in coredump without enlarging
mm_struct.
No changes in affected .o files.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The main loop in zap_threads() must skip kthreads which may use the same
mm. Otherwise we "kill" this thread erroneously (for example, it can not
fork or exec after that), and the coredumping task stucks in the
TASK_UNINTERRUPTIBLE state forever because of the wrong ->core_waiters
count.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Introduce the new PF_KTHREAD flag to mark the kernel threads. It is set
by INIT_TASK() and copied to the forked childs (we could set it in
kthreadd() along with PF_NOFREEZE instead).
daemonize() was changed as well. In that case testing of PF_KTHREAD is
racy, but daemonize() is hopeless anyway.
This flag is cleared in do_execve(), before search_binary_handler().
Probably not the best place, we can do this in exec_mmap() or in
start_thread(), or clear it along with PF_FORKNOEXEC. But I think this
doesn't matter in practice, and if do_execve() fails kthread should die
soon.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
No changes in fs/exec.o
The for_each_process() loop in zap_threads() is very subtle, it is not
clear why we don't race with fork/exit/exec. Add the fat comment.
Also, change the code to use while_each_thread().
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
fs/exec.c used to need mman.h pagemap.h swap.h and rmap.h when it did
mm-ish stuff in install_arg_page(); but no need for them after 2.6.22.
[akpm@linux-foundation.org: unbreak arm]
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>