Commit Graph

12445 Commits

Author SHA1 Message Date
Linus Torvalds
a4a4923919 Merge branch 'for-3.2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
* 'for-3.2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroups: fix a css_set not found bug in cgroup_attach_proc
2011-12-20 11:44:18 -08:00
Linus Torvalds
5fbd305dd2 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  time/clocksource: Fix kernel-doc warnings
  rtc: m41t80: Workaround broken alarm functionality
  rtc: Expire alarms after the time is set.
2011-12-20 11:42:38 -08:00
Michel Lespinasse
3d3c8f93a2 binary_sysctl(): fix memory leak
binary_sysctl() calls sysctl_getname() which allocates from names_cache
slab usin __getname()

The matching function to free the name is __putname(), and not putname()
which should be used only to match getname() allocations.

This is because when auditing is enabled, putname() calls audit_putname
*instead* (not in addition) to __putname().  Then, if a syscall is in
progress, audit_putname does not release the name - instead, it expects
the name to get released when the syscall completes, but that will happen
only if audit_getname() was called previously, i.e.  if the name was
allocated with getname() rather than the naked __getname().  So,
__getname() followed by putname() ends up leaking memory.

Signed-off-by: Michel Lespinasse <walken@google.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Eric Paris <eparis@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-12-20 10:25:04 -08:00
David Rientjes
b246272ecc cpusets: stall when updating mems_allowed for mempolicy or disjoint nodemask
Kernels where MAX_NUMNODES > BITS_PER_LONG may temporarily see an empty
nodemask in a tsk's mempolicy if its previous nodemask is remapped onto a
new set of allowed cpuset nodes where the two nodemasks, as a result of
the remap, are now disjoint.

c0ff7453bb ("cpuset,mm: fix no node to alloc memory when changing
cpuset's mems") adds get_mems_allowed() to prevent the set of allowed
nodes from changing for a thread.  This causes any update to a set of
allowed nodes to stall until put_mems_allowed() is called.

This stall is unncessary, however, if at least one node remains unchanged
in the update to the set of allowed nodes.  This was addressed by
89e8a244b9 ("cpusets: avoid looping when storing to mems_allowed if one
node remains set"), but it's still possible that an empty nodemask may be
read from a mempolicy because the old nodemask may be remapped to the new
nodemask during rebind.  To prevent this, only avoid the stall if there is
no mempolicy for the thread being changed.

This is a temporary solution until all reads from mempolicy nodemasks can
be guaranteed to not be empty without the get_mems_allowed()
synchronization.

Also moves the check for nodemask intersection inside task_lock() so that
tsk->mems_allowed cannot change.  This ensures that nothing can set this
tsk's mems_allowed out from under us and also protects tsk->mempolicy.

Reported-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Paul Menage <paul@paulmenage.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-12-20 10:25:04 -08:00
Mandeep Singh Baines
e0197aae59 cgroups: fix a css_set not found bug in cgroup_attach_proc
There is a BUG when migrating a PF_EXITING proc. Since css_set_prefetch()
is not called for the PF_EXITING case, find_existing_css_set() will return
NULL inside cgroup_task_migrate() causing a BUG.

This bug is easy to reproduce. Create a zombie and echo its pid to
cgroup.procs.

$ cat zombie.c
\#include <unistd.h>

int main()
{
  if (fork())
      pause();
  return 0;
}
$

We are hitting this bug pretty regularly on ChromeOS.

This bug is already fixed by Tejun Heo's cgroup patchset which is
targetted for the next merge window:

https://lkml.org/lkml/2011/11/1/356

I've create a smaller patch here which just fixes this bug so that a
fix can be merged into the current release and stable.

Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
Downstream-Bug-Report: http://crosbug.com/23953
Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: containers@lists.linux-foundation.org
Cc: cgroups@vger.kernel.org
Cc: stable@kernel.org
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Paul Menage <paul@paulmenage.org>
Cc: Olof Johansson <olofj@chromium.org>
2011-12-19 09:09:09 -08:00
Kusanagi Kouichi
b1b73d0950 time/clocksource: Fix kernel-doc warnings
Fix various KernelDoc build warnings.

Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp>
Cc: John Stultz <johnstul@us.ibm.com>
Link: http://lkml.kernel.org/r/20111219091320.0D5AF6FC03D@msa105.auone-net.jp
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-12-19 11:41:40 +01:00
Linus Torvalds
ab347d94d6 Merge branches 'perf-urgent-for-linus' and 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf events: Fix ring_buffer_wakeup() brown paperbag bug

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched: Fix select_idle_sibling() regression in selecting an idle SMT sibling
  MAINTAINERS: Update tip.git related git trees
2011-12-17 14:03:50 -08:00
Peter Zijlstra
ab2789213d sched: Fix select_idle_sibling() regression in selecting an idle SMT sibling
Mike Galbraith reported that this recent commit:

   commit 4dcfe1025b
   Author: Peter Zijlstra <peterz@infradead.org>
   Date:   Thu Nov 10 13:01:10 2011 +0100

       sched: Avoid SMT siblings in select_idle_sibling() if possible

stopped selecting an idle SMT sibling when there are no idle
cores in a single socket system.

Intent of the select_idle_sibling() was to fallback to an idle
SMT sibling, if it fails to identify an idle core. But this
fallback was not happening on systems where all the scheduler
domains had `SD_SHARE_PKG_RESOURCES' flag set.

Fix it. Slightly bigger patch of cleaning all these goto's etc
is queued up for the next release.

Reported-by: Mike Galbraith <efault@gmx.de>
Reported-by: Alex Shi <alex.shi@intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/1323978421.1984.244.camel@sbsiddha-desk.sc.intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-12-16 09:44:58 +01:00
Will Deacon
44b7f4b98d perf events: Fix ring_buffer_wakeup() brown paperbag bug
Commit 10c6db11 ("perf: Fix loss of notification with multi-event")
seems to unconditionally dereference event->rb in the wakeup handler,
this is wrong, there might not be a buffer attached.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20111213152651.GP20297@mudshark.cambridge.arm.com
[ minor edits ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-12-14 08:44:53 +01:00
Linus Torvalds
975e32c287 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf: Do no try to schedule task events if there are none
  lockdep, kmemcheck: Annotate ->lock in lockdep_init_map()
  perf header: Use event_name() to get an event name
  perf stat: Failure with "Operation not supported"
2011-12-09 08:07:24 -08:00
Mandeep Singh Baines
031af165b1 sys_getppid: add missing rcu_dereference
In order to safely dereference current->real_parent inside an
rcu_read_lock, we need an rcu_dereference.

Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Pavel Emelyanov <xemul@openvz.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-12-09 07:50:29 -08:00
Peter Zijlstra
09dc3cf93f printk: avoid double lock acquire
Commit 4f2a8d3cf5 ("printk: Fix console_sem vs logbuf_lock unlock race")
introduced another silly bug where we would want to acquire an already
held lock.  Avoid this.

Reported-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-12-09 07:50:28 -08:00
Linus Torvalds
09d9673d53 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  alarmtimers: Fix time comparison
  ptp: Fix clock_getres() implementation
2011-12-08 13:21:28 -08:00
Gleb Natapov
86b47c2549 perf: Do no try to schedule task events if there are none
perf_event_sched_in() shouldn't try to schedule task events if there
are none otherwise task's ctx->is_active will be set and will not be
cleared during sched_out. This will prevent newly added events from
being scheduled into the task context.

Fixes a boo-boo in commit 1d5f003f5a ("perf: Do not set task_ctx
pointer in cpuctx if there are no events in the context").

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20111122140821.GF2557@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-12-07 16:31:22 +01:00
Linus Torvalds
091c0f86ba Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  ftrace: Fix hash record accounting bug
  perf: Fix parsing of __print_flags() in TP_printk()
  jump_label: jump_label_inc may return before the code is patched
  ftrace: Remove force undef config value left for testing
  tracing: Restore system filter behavior
  tracing: fix event_subsystem ref counting
2011-12-06 11:54:33 -08:00
Yong Zhang
a33caeb118 lockdep, kmemcheck: Annotate ->lock in lockdep_init_map()
Since commit f59de89 ("lockdep: Clear whole lockdep_map on initialization"),
lockdep_init_map() will clear all the struct. But it will break
lock_set_class()/lock_set_subclass(). A typical race condition
is like below:

     CPU A                                   CPU B
lock_set_subclass(lockA);
 lock_set_class(lockA);
   lockdep_init_map(lockA);
     /* lockA->name is cleared */
     memset(lockA);
                                     __lock_acquire(lockA);
                                       /* lockA->class_cache[] is cleared */
                                       register_lock_class(lockA);
                                         look_up_lock_class(lockA);
                                           WARN_ON_ONCE(class->name !=
                                                     lock->name);

     lock->name = name;

So restore to what we have done before commit f59de89 but annotate
->lock with kmemcheck_mark_initialized() to suppress the kmemcheck
warning reported in commit f59de89.

Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reported-by: Borislav Petkov <bp@alien8.de>
Suggested-by: Vegard Nossum <vegard.nossum@gmail.com>
Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: <stable@kernel.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20111109080451.GB8124@zhy
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-12-06 18:18:13 +01:00
Thomas Gleixner
c9c024b3f3 alarmtimers: Fix time comparison
The expiry function compares the timer against current time and does
not expire the timer when the expiry time is >= now. That's wrong. If
the timer is set for now, then it must expire.

Make the condition expiry > now for breaking out the loop.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: John Stultz <john.stultz@linaro.org>
Cc: stable@kernel.org
2011-12-06 11:38:32 +01:00
Linus Torvalds
232ea34455 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf: Fix loss of notification with multi-event
  perf, x86: Force IBS LVT offset assignment for family 10h
  perf, x86: Disable PEBS on SandyBridge chips
  trace_events_filter: Use rcu_assign_pointer() when setting ftrace_event_call->filter
  perf session: Fix crash with invalid CPU list
  perf python: Fix undefined symbol problem
  perf/x86: Enable raw event access to Intel offcore events
  perf: Don't use -ENOSPC for out of PMU resources
  perf: Do not set task_ctx pointer in cpuctx if there are no events in the context
  perf/x86: Fix PEBS instruction unwind
  oprofile, x86: Fix crash when unloading module (nmi timer mode)
  oprofile: Fix crash when unloading module (hr timer mode)
2011-12-05 16:54:00 -08:00
Linus Torvalds
40c043b077 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clockevents: Set noop handler in clockevents_exchange_device()
  tick-broadcast: Stop active broadcast device when replacing it
  clocksource: Fix bug with max_deferment margin calculation
  rtc: Fix some bugs that allowed accumulating time drift in suspend/resume
  rtc: Disable the alarm in the hardware
2011-12-05 16:53:43 -08:00
Linus Torvalds
f14aa871c7 Merge branches 'core-urgent-for-linus' and 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  slab, lockdep: Fix silly bug

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq: Fix race condition when stopping the irq thread
2011-12-05 16:51:21 -08:00
Linus Torvalds
7125faceab Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched, x86: Avoid unnecessary overflow in sched_clock
  sched: Fix buglet in return_cfs_rq_runtime()
  sched: Avoid SMT siblings in select_idle_sibling() if possible
  sched: Set the command name of the idle tasks in SMP kernels
  sched, rt: Provide means of disabling cross-cpu bandwidth sharing
  sched: Document wait_for_completion_*() return values
  sched_fair: Fix a typo in the comment describing update_sd_lb_stats
  sched: Add a comment to effective_load() since it's a pain
2011-12-05 16:50:24 -08:00
Steven Rostedt
ddf6e0e507 ftrace: Fix hash record accounting bug
If the set_ftrace_filter is cleared by writing just whitespace to
it, then the filter hash refcounts will be decremented but not
updated. This causes two bugs:

1) No functions will be enabled for tracing when they all should be

2) If the users clears the set_ftrace_filter twice, it will crash ftrace:

------------[ cut here ]------------
WARNING: at /home/rostedt/work/git/linux-trace.git/kernel/trace/ftrace.c:1384 __ftrace_hash_rec_update.part.27+0x157/0x1a7()
Modules linked in:
Pid: 2330, comm: bash Not tainted 3.1.0-test+ #32
Call Trace:
 [<ffffffff81051828>] warn_slowpath_common+0x83/0x9b
 [<ffffffff8105185a>] warn_slowpath_null+0x1a/0x1c
 [<ffffffff810ba362>] __ftrace_hash_rec_update.part.27+0x157/0x1a7
 [<ffffffff810ba6e8>] ? ftrace_regex_release+0xa7/0x10f
 [<ffffffff8111bdfe>] ? kfree+0xe5/0x115
 [<ffffffff810ba51e>] ftrace_hash_move+0x2e/0x151
 [<ffffffff810ba6fb>] ftrace_regex_release+0xba/0x10f
 [<ffffffff8112e49a>] fput+0xfd/0x1c2
 [<ffffffff8112b54c>] filp_close+0x6d/0x78
 [<ffffffff8113a92d>] sys_dup3+0x197/0x1c1
 [<ffffffff8113a9a6>] sys_dup2+0x4f/0x54
 [<ffffffff8150cac2>] system_call_fastpath+0x16/0x1b
---[ end trace 77a3a7ee73794a02 ]---

Link: http://lkml.kernel.org/r/20111101141420.GA4918@debian

Reported-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-12-05 13:28:47 -05:00
Gleb Natapov
bbbf7af4bf jump_label: jump_label_inc may return before the code is patched
If cpu A calls jump_label_inc() just after atomic_add_return() is
called by cpu B, atomic_inc_not_zero() will return value greater then
zero and jump_label_inc() will return to a caller before jump_label_update()
finishes its job on cpu B.

Link: http://lkml.kernel.org/r/20111018175551.GH17571@redhat.com

Cc: stable@vger.kernel.org
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-12-05 13:28:46 -05:00
Steven Rostedt
c7c6ec8bec ftrace: Remove force undef config value left for testing
A forced undef of a config value was used for testing and was
accidently left in during the final commit. This causes x86 to
run slower than needed while running function tracing as well
as causes the function graph selftest to fail when DYNMAIC_FTRACE
is not set. This is because the code in MCOUNT expects the ftrace
code to be processed with the config value set that happened to
be forced not set.

The forced config option was left in by:
    commit 6331c28c96
    ftrace: Fix dynamic selftest failure on some archs

Link: http://lkml.kernel.org/r/20111102150255.GA6973@debian

Cc: stable@vger.kernel.org
Reported-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-12-05 13:28:45 -05:00
Li Zefan
27b14b56af tracing: Restore system filter behavior
Though not all events have field 'prev_pid', it was allowed to do this:

  # echo 'prev_pid == 100' > events/sched/filter

but commit 75b8e98263 (tracing/filter: Swap
entire filter of events) broke it without any reason.

Link: http://lkml.kernel.org/r/4EAF46CF.8040408@cn.fujitsu.com

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-12-05 13:28:45 -05:00