Commit Graph

9415 Commits

Author SHA1 Message Date
Andreas Schwab
46da276648 kernel/sys.c: fix compat uname machine
On ppc64 you get this error:

  $ setarch ppc -R true
  setarch: ppc: Unrecognized architecture

because uname still reports ppc64 as the machine.

So mask off the personality flags when checking for PER_LINUX32.

Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-24 11:31:24 -07:00
David Howells
e134d200d5 CRED: Fix a race in creds_are_invalid() in credentials debugging
creds_are_invalid() reads both cred->usage and cred->subscribers and then
compares them to make sure the number of processes subscribed to a cred struct
never exceeds the refcount of that cred struct.

The problem is that this can cause a race with both copy_creds() and
exit_creds() as the two counters, whilst they are of atomic_t type, are only
atomic with respect to themselves, and not atomic with respect to each other.

This means that if creds_are_invalid() can read the values on one CPU whilst
they're being modified on another CPU, and so can observe an evolving state in
which the subscribers count now is greater than the usage count a moment
before.

Switching the order in which the counts are read cannot help, so the thing to
do is to remove that particular check.

I had considered rechecking the values to see if they're in flux if the test
fails, but I can't guarantee they won't appear the same, even if they've
changed several times in the meantime.

Note that this can only happen if CONFIG_DEBUG_CREDENTIALS is enabled.

The problem is only likely to occur with multithreaded programs, and can be
tested by the tst-eintr1 program from glibc's "make check".  The symptoms look
like:

	CRED: Invalid credentials
	CRED: At include/linux/cred.h:240
	CRED: Specified credentials: ffff88003dda5878 [real][eff]
	CRED: ->magic=43736564, put_addr=(null)
	CRED: ->usage=766, subscr=766
	CRED: ->*uid = { 0,0,0,0 }
	CRED: ->*gid = { 0,0,0,0 }
	CRED: ->security is ffff88003d72f538
	CRED: ->security {359, 359}
	------------[ cut here ]------------
	kernel BUG at kernel/cred.c:850!
	...
	RIP: 0010:[<ffffffff81049889>]  [<ffffffff81049889>] __invalid_creds+0x4e/0x52
	...
	Call Trace:
	 [<ffffffff8104a37b>] copy_creds+0x6b/0x23f

Note the ->usage=766 and subscr=766.  The values appear the same because
they've been re-read since the check was made.

Reported-by: Roland McGrath <roland@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-04-22 09:14:29 +10:00
David Howells
eff30363c0 CRED: Fix double free in prepare_usermodehelper_creds() error handling
Patch 570b8fb505:

	Author: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
	Date:   Tue Mar 30 00:04:00 2010 +0100
	Subject: CRED: Fix memory leak in error handling

attempts to fix a memory leak in the error handling by making the offending
return statement into a jump down to the bottom of the function where a
kfree(tgcred) is inserted.

This is, however, incorrect, as it does a kfree() after doing put_cred() if
security_prepare_creds() fails.  That will result in a double free if 'error'
is jumped to as put_cred() will also attempt to free the new tgcred record by
virtue of it being pointed to by the new cred record.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2010-04-21 09:20:35 +10:00
Paul E. McKenney
bc293d62b2 rcu: Make RCU lockdep check the lockdep_recursion variable
The lockdep facility temporarily disables lockdep checking by
incrementing the current->lockdep_recursion variable.  Such
disabling happens in NMIs and in other situations where lockdep
might expect to recurse on itself.

This patch therefore checks current->lockdep_recursion, disabling RCU
lockdep splats when this variable is non-zero.  In addition, this patch
removes the "likely()", as suggested by Lai Jiangshan.

Reported-by: Frederic Weisbecker <fweisbec@gmail.com>
Reported-by: David Miller <davem@davemloft.net>
Tested-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: laijs@cn.fujitsu.com
Cc: dipankar@in.ibm.com
Cc: mathieu.desnoyers@polymtl.ca
Cc: josh@joshtriplett.org
Cc: dvhltc@us.ibm.com
Cc: niv@us.ibm.com
Cc: peterz@infradead.org
Cc: rostedt@goodmis.org
Cc: Valdis.Kletnieks@vt.edu
Cc: dhowells@redhat.com
Cc: eric.dumazet@gmail.com
LKML-Reference: <20100415195039.GA22623@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-19 08:37:19 +02:00
Jiri Slaby
d88d4050dc PM / Hibernate: user.c, fix SNAPSHOT_SET_SWAP_AREA handling
When CONFIG_DEBUG_BLOCK_EXT_DEVT is set we decode the device
improperly by old_decode_dev and it results in an error while
hibernating with s2disk.

All users already pass the new device number, so switch to
new_decode_dev().

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Reported-and-tested-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: "Rafael J. Wysocki" <rjw@sisk.pl>
2010-04-10 22:28:56 +02:00
Linus Torvalds
2aedd192f7 Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: Fix sched_getaffinity()
2010-04-08 08:37:05 -07:00
KAMEZAWA Hiroyuki
a3a2e76c77 mm: avoid null-pointer deref in sync_mm_rss()
- We weren't zeroing p->rss_stat[] at fork()

- Consequently sync_mm_rss() was dereferencing tsk->mm for kernel
  threads and was oopsing.

- Make __sync_task_rss_stat() static, too.

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=15648

[akpm@linux-foundation.org: remove the BUG_ON(!mm->rss)]
Reported-by: Troels Liebe Bentsen <tlb@rapanden.dk>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
"Michael S. Tsirkin" <mst@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-07 08:38:02 -07:00
Linus Torvalds
94c4fcec01 Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  genirq: Force MSI irq handlers to run with interrupts disabled
2010-04-06 13:03:22 -07:00
Anton Blanchard
84fba5ec91 sched: Fix sched_getaffinity()
taskset on 2.6.34-rc3 fails on one of my ppc64 test boxes with
the following error:

  sched_getaffinity(0, 16, 0x10029650030) = -1 EINVAL (Invalid argument)

This box has 128 threads and 16 bytes is enough to cover it.

Commit cd3d8031eb (sched:
sched_getaffinity(): Allow less than NR_CPUS length) is
comparing this 16 bytes agains nr_cpu_ids.

Fix it by comparing nr_cpu_ids to the number of bits in the
cpumask we pass in.

Signed-off-by: Anton Blanchard <anton@samba.org>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Sharyathi Nagesh <sharyath@in.ibm.com>
Cc: Ulrich Drepper <drepper@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Russ Anderson <rja@sgi.com>
Cc: Mike Travis <travis@sgi.com>
LKML-Reference: <20100406070218.GM5594@kryten>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-06 10:01:35 +02:00
Nick Piggin
5fbfb18d7a Fix up possibly racy module refcounting
Module refcounting is implemented with a per-cpu counter for speed.
However there is a race when tallying the counter where a reference may
be taken by one CPU and released by another.  Reference count summation
may then see the decrement without having seen the previous increment,
leading to lower than expected count.  A module which never has its
actual reference drop below 1 may return a reference count of 0 due to
this race.

Module removal generally runs under stop_machine, which prevents this
race causing bugs due to removal of in-use modules.  However there are
other real bugs in module.c code and driver code (module_refcount is
exported) where the callers do not run under stop_machine.

Fix this by maintaining running per-cpu counters for the number of
module refcount increments and the number of refcount decrements.  The
increments are tallied after the decrements, so any decrement seen will
always have its corresponding increment counted.  The final refcount is
the difference of the total increments and decrements, preventing a
low-refcount from being returned.

Signed-off-by: Nick Piggin <npiggin@suse.de>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-05 19:50:02 -07:00
Eric Paris
449cedf099 audit: preface audit printk with audit
There have been a number of reports of people seeing the message:
"name_count maxed, losing inode data: dev=00:05, inode=3185"
in dmesg.  These usually lead to people reporting problems to the filesystem
group who are in turn clueless what they mean.

Eventually someone finds me and I explain what is going on and that
these come from the audit system.  The basics of the problem is that the
audit subsystem never expects a single syscall to 'interact' (for some
wish washy meaning of interact) with more than 20 inodes.  But in fact
some operations like loading kernel modules can cause changes to lots of
inodes in debugfs.

There are a couple real fixes being bandied about including removing the
fixed compile time limit of 20 or not auditing changes in debugfs (or
both) but neither are small and obvious so I am not sending them for
immediate inclusion (I hope Al forwards a real solution next devel
window).

In the meantime this patch simply adds 'audit' to the beginning of the
crap message so if a user sees it, they come blame me first and we can
talk about what it means and make sure we understand all of the reasons
it can happen and make sure this gets solved correctly in the long run.

Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-05 13:19:45 -07:00
Linus Torvalds
b66696e3c0 Merge branch 'slabh' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc
* 'slabh' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc:
  eeepc-wmi: include slab.h
  staging/otus: include slab.h from usbdrv.h
  percpu: don't implicitly include slab.h from percpu.h
  kmemcheck: Fix build errors due to missing slab.h
  include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
  iwlwifi: don't include iwl-dev.h from iwl-devtrace.h
  x86: don't include slab.h from arch/x86/include/asm/pgtable_32.h

Fix up trivial conflicts in include/linux/percpu.h due to
is_kernel_percpu_address() having been introduced since the slab.h
cleanup with the percpu_up.c splitup.
2010-04-05 09:39:11 -07:00
Linus Torvalds
9e74e7c81a Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  module: add stub for is_module_percpu_address
  percpu, module: implement and use is_kernel/module_percpu_address()
  module: encapsulate percpu handling better and record percpu_size
2010-04-05 09:16:37 -07:00
Tejun Heo
336f5899d2 Merge branch 'master' into export-slabh 2010-04-05 11:37:28 +09:00
Linus Torvalds
8ce42c8b7f Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf: Always build the powerpc perf_arch_fetch_caller_regs version
  perf: Always build the stub perf_arch_fetch_caller_regs version
  perf, probe-finder: Build fix on Debian
  perf/scripts: Tuple was set from long in both branches in python_process_event()
  perf: Fix 'perf sched record' deadlock
  perf, x86: Fix callgraphs of 32-bit processes on 64-bit kernels
  perf, x86: Fix AMD hotplug & constraint initialization
  x86: Move notify_cpu_starting() callback to a later stage
  x86,kgdb: Always initialize the hw breakpoint attribute
  perf: Use hot regs with software sched switch/migrate events
  perf: Correctly align perf event tracing buffer
2010-04-04 12:13:10 -07:00
Linus Torvalds
0121b0c771 Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: set_cpus_allowed_ptr(): Don't use rq->migration_thread after unlock
  sched: Fix proc_sched_set_task()
2010-04-04 12:12:31 -07:00
Linus Torvalds
a8941b0ed0 Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  ring-buffer: Add missing unlock
  tracing: Fix lockdep warning in global_clock()
2010-04-04 12:12:19 -07:00
Frederic Weisbecker
26d80aa782 perf: Always build the stub perf_arch_fetch_caller_regs version
Now that software events use perf_arch_fetch_caller_regs() too, we
need the stub version to be always built in for archs that don't
implement it.

Fixes the following build error in PARISC:

	kernel/built-in.o: In function `perf_event_task_sched_out':
	(.text.perf_event_task_sched_out+0x54): undefined reference to `perf_arch_fetch_caller_regs'

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
2010-04-03 12:22:05 +02:00
Linus Torvalds
5e123e5d9b Merge branch 'kgdb-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb
* 'kgdb-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb:
  kgdb: Turn off tracing while in the debugger
  kgdb: use atomic_inc and atomic_dec instead of atomic_set
  kgdb: eliminate kgdb_wait(), all cpus enter the same way
  kgdbts,sh: Add in breakpoint pc offset for superh
  kgdb: have ebin2mem call probe_kernel_write once
2010-04-02 19:45:05 -07:00
Linus Torvalds
24b99d1576 Merge branch 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6
* 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
  Freezer: Fix buggy resume test for tasks frozen with cgroup freezer
  Freezer: Only show the state of tasks refusing to freeze
2010-04-02 19:44:42 -07:00
Jason Wessel
4da75b9cea kgdb: Turn off tracing while in the debugger
The kernel debugger should turn off kernel tracing any time the
debugger is active and restore it on resume.

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
2010-04-02 14:58:19 -05:00
Jason Wessel
ae6bf53e02 kgdb: use atomic_inc and atomic_dec instead of atomic_set
Memory barriers should be used for the kgdb cpu synchronization.  The
atomic_set() does not imply a memory barrier.

Reported-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2010-04-02 14:58:18 -05:00
Jason Wessel
62fae31219 kgdb: eliminate kgdb_wait(), all cpus enter the same way
This is a kgdb architectural change to have all the cpus (master or
slave) enter the same function.

A cpu that hits an exception (wants to be the master cpu) will call
kgdb_handle_exception() from the trap handler and then invoke a
kgdb_roundup_cpu() to synchronize the other cpus and bring them into
the kgdb_handle_exception() as well.

A slave cpu will enter kgdb_handle_exception() from the
kgdb_nmicallback() and set the exception state to note that the
processor is a slave.

Previously the salve cpu would have called kgdb_wait().  This change
allows the debug core to change cpus without resuming the system in
order to inspect arch specific cpu information.

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2010-04-02 14:58:18 -05:00
Jason Wessel
a0279bd580 kgdb: have ebin2mem call probe_kernel_write once
Rather than call probe_kernel_write() one byte at a time, process the
whole buffer locally and pass the entire result in one go.  This way,
architectures that need to do special handling based on the length can
do so, or we only end up calling memcpy() once.

[sonic.zhang@analog.com: Reported original problem and preliminary patch]

Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Sonic Zhang <sonic.zhang@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
2010-04-02 14:58:17 -05:00
Oleg Nesterov
47a70985e5 sched: set_cpus_allowed_ptr(): Don't use rq->migration_thread after unlock
Trivial typo fix. rq->migration_thread can be NULL after
task_rq_unlock(), this is why we have "mt" which should be
 used instead.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20100330165829.GA18284@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-04-02 20:11:05 +02:00