commit b4c30aad39 upstream.
Several leaks in audit_tree didn't get caught by commit
318b6d3d7d, including the leak on normal
exit in case of multiple rules refering to the same chunk.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit 6f5d511489 upstream.
... aka "Al had badly fscked up when writing that thing and nobody
noticed until Eric had fixed leaks that used to mask the breakage".
The function essentially creates a copy of old array sans one element
and replaces the references to elements of original (they are on cyclic
lists) with those to corresponding elements of new one. After that the
old one is fair game for freeing.
First of all, there's a dumb braino: when we get to list_replace_init we
use indices for wrong arrays - position in new one with the old array
and vice versa.
Another bug is more subtle - termination condition is wrong if the
element to be excluded happens to be the last one. We shouldn't go
until we fill the new array, we should go until we'd finished the old
one. Otherwise the element we are trying to kill will remain on the
cyclic lists...
That crap used to be masked by several leaks, so it was not quite
trivial to hit. Eric had fixed some of those leaks a while ago and the
shit had hit the fan...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Stable commit 0399123f3d didn't match the
original upstream commit. The CONFIG_MMU check was added much too early
in the list disabling a lot of proc entries in the process.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit b45c6e76bc upstream.
When print-fatal-signals is enabled it's possible to dump any memory
reachable by the kernel to the log by simply jumping to that address from
user space.
Or crash the system if there's some hardware with read side effects.
The fatal signals handler will dump 16 bytes at the execution address,
which is fully controlled by ring 3.
In addition when something jumps to a unmapped address there will be up to
16 additional useless page faults, which might be potentially slow (and at
least is not very efficient)
Fortunately this option is off by default and only there on i386.
But fix it by checking for kernel addresses and also stopping when there's
a page fault.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit bd4f490a07 upstream.
The LTP cgroup test suite generates a "kernel BUG at kernel/cgroup.c:790!"
here in cgroup_diput():
/*
* if we're getting rid of the cgroup, refcount should ensure
* that there are no pidlists left.
*/
BUG_ON(!list_empty(&cgrp->pidlists));
The cgroup pidlist rework in 2.6.32 generates the BUG_ON, which is caused
when pidlist_array_load() calls cgroup_pidlist_find():
(1) if a matching cgroup_pidlist is found, it down_write's the mutex of the
pre-existing cgroup_pidlist, and increments its use_count.
(2) if no matching cgroup_pidlist is found, then a new one is allocated, it
down_write's its mutex, and the use_count is set to 0.
(3) the matching, or new, cgroup_pidlist gets returned back to pidlist_array_load(),
which increments its use_count -- regardless whether new or pre-existing --
and up_write's the mutex.
So if a matching list is ever encountered by cgroup_pidlist_find() during
the life of a cgroup directory, it results in an inflated use_count value,
preventing it from ever getting released by cgroup_release_pid_array().
Then if the directory is subsequently removed, cgroup_diput() hits the
BUG_ON() when it finds that the directory's cgroup is still populated with
a pidlist.
The patch simply removes the use_count increment when a matching pidlist
is found by cgroup_pidlist_find(), because it gets bumped by the calling
pidlist_array_load() function while still protected by the list's mutex.
Signed-off-by: Dave Anderson <anderson@redhat.com>
Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Ben Blum <bblum@andrew.cmu.edu>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit 10b465aaf9 upstream.
Commit 35dead4 "modules: don't export section names of empty sections
via sysfs" changed the set of sections that have attributes, but did
not change the iteration over these attributes in add_notes_attrs().
This can lead to add_notes_attrs() creating attributes with the wrong
names or with null name pointers.
Introduce a sect_empty() function and use it in both add_sect_attrs()
and add_notes_attrs().
Reported-by: Martin Michlmayr <tbm@cyrius.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Tested-by: Martin Michlmayr <tbm@cyrius.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit 047106adcc upstream.
Heiko reported a case where a timer interrupt managed to
reference a root_domain structure that was already freed by a
concurrent hot-un-plug operation.
Solve this like the regular sched_domain stuff is also
synchronized, by adding a synchronize_sched() stmt to the free
path, this ensures that a root_domain stays present for any
atomic section that could have observed it.
Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Gregory Haskins <ghaskins@novell.com>
Cc: Siddha Suresh B <suresh.b.siddha@intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
LKML-Reference: <1258363873.26714.83.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit 6ad4c18884 upstream.
Since (e761b77: cpu hotplug, sched: Introduce cpu_active_map and redo
sched domain managment) we have cpu_active_mask which is suppose to rule
scheduler migration and load-balancing, except it never (fully) did.
The particular problem being solved here is a crash in try_to_wake_up()
where select_task_rq() ends up selecting an offline cpu because
select_task_rq_fair() trusts the sched_domain tree to reflect the
current state of affairs, similarly select_task_rq_rt() trusts the
root_domain.
However, the sched_domains are updated from CPU_DEAD, which is after the
cpu is taken offline and after stop_machine is done. Therefore it can
race perfectly well with code assuming the domains are right.
Cure this by building the domains from cpu_active_mask on
CPU_DOWN_PREPARE.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Holger Hoffstätte <holger.hoffstaette@googlemail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit 70da2340fb upstream.
Jan Engelhardt reported we have this problem:
setting max_map_count to a value large enough results in programs dying at
first try. This is on 2.6.31.6:
15:59 borg:/proc/sys/vm # echo $[1<<31-1] >max_map_count
15:59 borg:/proc/sys/vm # cat max_map_count
1073741824
15:59 borg:/proc/sys/vm # echo $[1<<31] >max_map_count
15:59 borg:/proc/sys/vm # cat max_map_count
Killed
This is because we have a chance to make 'max_map_count' negative. but
it's meaningless. Make it only accept non-negative values.
Reported-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: James Morris <jmorris@namei.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit 6e14154676 upstream.
In NOMMU mode clamp dac_mmap_min_addr to zero to cause the tests on it to be
skipped by the compiler. We do this as the minimum mmap address doesn't make
any sense in NOMMU mode.
mmap_min_addr and round_hint_to_min() can be discarded entirely in NOMMU mode.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit bb6eddf767 upstream.
Xiaotian Feng triggered a list corruption in the clock events list on
CPU hotplug and debugged the root cause.
If a CPU registers more than one per cpu clock event device, then only
the active clock event device is removed on CPU_DEAD. The unused
devices are kept in the clock events device list.
On CPU up the clock event devices are registered again, which means
that we list_add an already enqueued list_head. That results in list
corruption.
Resolve this by removing all devices which are associated to the dead
CPU on CPU_DEAD.
Reported-by: Xiaotian Feng <dfeng@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Xiaotian Feng <dfeng@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit 0f624e7e56 upstream.
It is quite legitimate for CPUs to be numbered sparsely, meaning
that it possible for an online CPU to have a number which is
greater than the total count of possible CPUs.
Currently find_get_context() has a sanity check on the cpu
number where it checks it against num_possible_cpus(). This
test can fail for a legitimate cpu number if the
cpu_possible_mask is sparsely populated.
This fixes the problem by checking the CPU number against
nr_cpumask_bits instead, since that is the appropriate check to
ensure that the cpu number is same to pass to cpu_isset()
subsequently.
Reported-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Tested-by: Michael Neuling <mikey@neuling.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20091215084032.GA18661@brick.ozlabs.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit eae0c9dfb5 upstream.
Commit 1b9508f, "Rate-limit newidle" has been confirmed to fix
the netperf UDP loopback regression reported by Alex Shi.
This is a cleanup and a fix:
- moved to a more out of the way spot
- fix to ensure that balancing doesn't try to balance
runqueues which haven't gone online yet, which can
mess up CPU enumeration during boot.
Reported-by: Alex Shi <alex.shi@intel.com>
Reported-by: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1257821402.5648.17.camel@marge.simson.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit a1f84a3ab8 upstream.
When waking affine, check for an idle shared cache, and if
found, wake to that CPU/sibling instead of the waker's CPU.
This improves pgsql+oltp ramp up by roughly 8%. Possibly more
for other loads, depending on overlap. The trade-off is a
roughly 1% peak downturn if tasks are truly synchronous.
Signed-off-by: Mike Galbraith <efault@gmx.de>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <1256654138.17752.7.camel@marge.simson.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>