Commit Graph

7615 Commits

Author SHA1 Message Date
Xiao Guangrong
04aef32d39 tracing/function: Fix the return value of ftrace_trace_onoff_callback()
ftrace_trace_onoff_callback() will return an error even if we do the
right operation, for example:

 # echo _spin_*:traceon:10 > set_ftrace_filter
 -bash: echo: write error: Invalid argument
 # cat set_ftrace_filter
 #### all functions enabled ####
 _spin_trylock_bh:traceon:count=10
 _spin_unlock_irq:traceon:count=10
 _spin_unlock_bh:traceon:count=10
 _spin_lock_irq:traceon:count=10
 _spin_unlock:traceon:count=10
 _spin_trylock:traceon:count=10
 _spin_unlock_irqrestore:traceon:count=10
 _spin_lock_irqsave:traceon:count=10
 _spin_lock_bh:traceon:count=10
 _spin_lock:traceon:count=10

We want to set _spin_*:traceon:10 to set_ftrace_filter, it complains
with "Invalid argument", but the operation is successful.

This is because ftrace_process_regex() returns the number of functions that
matched the pattern. If the number is not 0, this value is returned
by ftrace_regex_write() whereas we want to return the number of bytes
virtually written.
Also the file offset pointer is not updated in this case.

If the number of matched functions is lower than the number of bytes written
by the user, this results to a reprocessing of the string given by the user with
a lower size, leading to a malformed ftrace regex and then a -EINVAL returned.

So, this patch fixes it by returning 0 if no error occured.
The fix also applies on 2.6.30

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
Cc: stable@kernel.org
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-07-16 23:34:32 -04:00
Linus Torvalds
4b0a84043e Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/peterz/linux-2.6-sched
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/peterz/linux-2.6-sched:
  sched: Fix bug in SCHED_IDLE interaction with group scheduling
  sched: Fix rt_rq->pushable_tasks initialization in init_rt_rq()
  sched: Reset sched stats on fork()
  sched_rt: Fix overload bug on rt group scheduling
  sched: Documentation/sched-rt-group: Fix style issues & bump version
2009-07-16 10:18:29 -07:00
Linus Torvalds
62f49052ac Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  hrtimer: Fix migration expiry check
  hrtimer: migration: do not check expiry time on current CPU
2009-07-14 18:35:24 -07:00
Linus Torvalds
989fa94096 Merge branch 'core-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  futexes: Fix infinite loop in get_futex_key() on huge page
2009-07-14 18:35:00 -07:00
Steven Rostedt
6ab5d668b1 tracing/function-profiler: do not free per cpu variable stat
The per cpu variable stat is freeded if we fail to allocate a name
on start up. This was due to stat at first being allocated in the
initial design. But since then, it has become a static per cpu variable
but the free on error was not removed.

Also added __init annotation to the function that this is in.

[ Impact: prevent possible memory corruption on low mem at boot up ]

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-13 11:01:10 +02:00
Linus Torvalds
7638d5322b Merge branch 'kmemleak' of git://linux-arm.org/linux-2.6
* 'kmemleak' of git://linux-arm.org/linux-2.6:
  kmemleak: Remove alloc_bootmem annotations introduced in the past
  kmemleak: Add callbacks to the bootmem allocator
  kmemleak: Allow partial freeing of memory blocks
  kmemleak: Trace the kmalloc_large* functions in slub
  kmemleak: Scan objects allocated during a scanning episode
  kmemleak: Do not acquire scan_mutex in kmemleak_open()
  kmemleak: Remove the reported leaks number limitation
  kmemleak: Add more cond_resched() calls in the scanning thread
  kmemleak: Renice the scanning thread to +10
2009-07-12 12:24:35 -07:00
Alexey Dobriyan
405f55712d headers: smp_lock.h redux
* Remove smp_lock.h from files which don't need it (including some headers!)
* Add smp_lock.h to files which do need it
* Make smp_lock.h include conditional in hardirq.h
  It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT

  This will make hardirq.h inclusion cheaper for every PREEMPT=n config
  (which includes allmodconfig/allyesconfig, BTW)

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-12 12:22:34 -07:00
Sonny Rao
ce2ae53b75 futexes: Fix infinite loop in get_futex_key() on huge page
get_futex_key() can infinitely loop if it is called on a
virtual address that is within a huge page but not aligned to
the beginning of that page.  The call to get_user_pages_fast
will return the struct page for a sub-page within the huge page
and the check for page->mapping will always fail.

The fix is to call compound_head on the page before checking
that it's mapped.

Signed-off-by: Sonny Rao <sonnyrao@us.ibm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@kernel.org
Cc: anton@samba.org
Cc: rajamony@us.ibm.com
Cc: speight@us.ibm.com
Cc: mstephen@us.ibm.com
Cc: grimm@us.ibm.com
Cc: mikey@ozlabs.au.ibm.com
LKML-Reference: <20090710231313.GA23572@us.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-11 12:40:44 +02:00
Paul Turner
d07387b490 sched: Fix bug in SCHED_IDLE interaction with group scheduling
One of the isolation modifications for SCHED_IDLE is the
unitization of sleeper credit.  However the check for this
assumes that the sched_entity we're placing always belongs to a
task.

This is potentially not true with group scheduling and leaves
us rummaging randomly when we try to pull the policy.

Signed-off-by: Paul Turner <pjt@google.com>
Cc: peterz@infradead.org
LKML-Reference: <alpine.DEB.1.00.0907101649570.29914@kitami.corp.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-11 10:00:09 +02:00
Linus Torvalds
ac3f482236 Merge branch 'core-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  dma-debug: Fix the overlap() function to be correct and readable
  oprofile: reset bt_lost_no_mapping with other stats
  x86/oprofile: rename kernel parameter for architectural perfmon to arch_perfmon
  signals: declare sys_rt_tgsigqueueinfo in syscalls.h
  rcu: Mark Hierarchical RCU no longer experimental
  dma-debug: Put all hash-chain locks into the same lock class
  dma-debug: fix off-by-one error in overlap function
2009-07-10 14:25:59 -07:00
Peter Zijlstra
d86ee4809d sched: optimize cond_resched()
Optimize cond_resched() by removing one conditional.

Currently cond_resched() checks system_state ==
SYSTEM_RUNNING in order to avoid scheduling before the
scheduler is running.

We can however, as per suggestion of Matt, use
PREEMPT_ACTIVE to accomplish that very same.

Suggested-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Matt Mackall <mpm@selenic.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-10 14:24:05 -07:00
Linus Torvalds
2a6f86bc5e Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  tracing: Fix trace_print_seq()
  kprobes: No need to unlock kprobe_insn_mutex
  tracing/fastboot: Document the need of initcall_debug
  trace_export: Repair missed fields
  tracing: Fix stack tracer sysctl handling
2009-07-10 11:41:41 -07:00
Thomas Gleixner
6ff7041dbf hrtimer: Fix migration expiry check
The timer migration expiry check should prevent the migration of a
timer to another CPU when the timer expires before the next event is
scheduled on the other CPU. Migrating the timer might delay it because
we can not reprogram the clock event device on the other CPU. But the
code implementing that check has two flaws:

- for !HIGHRES the check compares the expiry value with the clock
  events device expiry value which is wrong for CLOCK_REALTIME based
  timers.

- the check is racy. It holds the hrtimer base lock of the target CPU,
  but the clock event device expiry value can be modified
  nevertheless, e.g. by an timer interrupt firing.

The !HIGHRES case is easy to fix as we can enqueue the timer on the
cpu which was selected by the load balancer. It runs the idle
balancing code once per jiffy anyway. So the maximum delay for the
timer is the same as when we keep the tick on the current cpu going.

In the HIGHRES case we can get the next expiry value from the hrtimer
cpu_base of the target CPU and serialize the update with the cpu_base
lock. This moves the lock section in hrtimer_interrupt() so we can set
next_event to KTIME_MAX while we are handling the expired timers and
set it to the next expiry value after we handled the timers under the
base lock. While the expired timers are processed timer migration is
blocked because the expiry time of the timer is always <= KTIME_MAX.

Also remove the now useless clockevents_get_next_event() function.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-07-10 17:32:55 +02:00
Thomas Gleixner
7e0c5086c1 hrtimer: migration: do not check expiry time on current CPU
The timer migration code needs to check whether the expiry time of the
timer is before the programmed clock event expiry time when the timer
is enqueued on another CPU because we can not reprogram the timer
device on the other CPU. The current logic checks the expiry time even
if we enqueue on the current CPU when nohz_get_load_balancer() returns
current CPU. This might lead to an endless loop in the expiry check
code when the expiry time of the timer is before the current
programmed next event.

Check whether nohz_get_load_balancer() returns current CPU and skip
the expiry check if this is the case.

The bug was triggered from the networking code. The patch fixes the
regression http://bugzilla.kernel.org/show_bug.cgi?id=13738
(Soft-Lockup/Race in networking in 2.6.31-rc1+195)

Cc: Arun Bharadwaj <arun@linux.vnet.ibm.com
Tested-by: Joao Correia <joaomiguelcorreia@gmail.com>
Tested-by: Andres Freund <andres@anarazel.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2009-07-10 17:22:20 +02:00
Fabio Checconi
c20b08e398 sched: Fix rt_rq->pushable_tasks initialization in init_rt_rq()
init_rt_rq() initializes only rq->rt.pushable_tasks, and not the
pushable_tasks field of the passed rt_rq.  The plist is not used
uninitialized since the only pushable_tasks plists used are the
ones of root rt_rqs; anyway reinitializing the list on every group
creation corrupts the root plist, losing its previous contents.

Signed-off-by: Fabio Checconi <fabio@gandalf.sssup.it>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20090615185638.GK21741@gandalf.sssup.it>
CC: Gregory Haskins <ghaskins@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-10 10:43:30 +02:00
Lucas De Marchi
7793527b90 sched: Reset sched stats on fork()
The sched_stat fields are currently not reset upon fork.
Ingo's recent commit 6c594c21fc
did reset nr_migrations, but it didn't reset any of the
others.

This patch resets all sched_stat fields on fork.

Signed-off-by: Lucas De Marchi <lucas.de.marchi@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <193b0f820907090457s7a3662f4gcdecdc22fcae857b@mail.gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-10 10:43:29 +02:00
Peter Zijlstra
a1ba4d8ba9 sched_rt: Fix overload bug on rt group scheduling
Fixes an easily triggerable BUG() when setting process affinities.

Make sure to count the number of migratable tasks in the same place:
the root rt_rq. Otherwise the number doesn't make sense and we'll hit
the BUG in set_cpus_allowed_rt().

Also, make sure we only count tasks, not groups (this is probably
already taken care of by the fact that rt_se->nr_cpus_allowed will be 0
for groups, but be more explicit)

Tested-by: Thomas Gleixner <tglx@linutronix.de>
CC: stable@kernel.org
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Gregory Haskins <ghaskins@novell.com>
LKML-Reference: <1247067476.9777.57.camel@twins>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-10 10:43:29 +02:00
Catalin Marinas
264ef8a904 kmemleak: Remove alloc_bootmem annotations introduced in the past
kmemleak_alloc() calls were added in some places where alloc_bootmem was
called. Since now kmemleak tracks bootmem allocations, these explicit
calls should be run.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-07-09 17:07:02 +01:00
Joe Perches
ad361c9884 Remove multiple KERN_ prefixes from printk formats
Commit 5fd29d6ccb ("printk: clean up
handling of log-levels and newlines") changed printk semantics.  printk
lines with multiple KERN_<level> prefixes are no longer emitted as
before the patch.

<level> is now included in the output on each additional use.

Remove all uses of multiple KERN_<level>s in formats.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-08 10:30:03 -07:00
Alexey Dobriyan
b43f3cbd21 headers: mnt_namespace.h redux
Fix various silly problems wrt mnt_namespace.h:

 - exit_mnt_ns() isn't used, remove it
 - done that, sched.h and nsproxy.h inclusions aren't needed
 - mount.h inclusion was need for vfsmount_lock, but no longer
 - remove mnt_namespace.h inclusion from files which don't use anything
   from mnt_namespace.h

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-08 09:31:56 -07:00
Oleg Nesterov
793285fcaf cred_guard_mutex: do not return -EINTR to user-space
do_execve() and ptrace_attach() return -EINTR if
mutex_lock_interruptible(->cred_guard_mutex) fails.

This is not right, change the code to return ERESTARTNOINTR.

Perhaps we should also change proc_pid_attr_write().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Acked-by: Roland McGrath <roland@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-06 13:57:04 -07:00
Kevin Cernekee
5bfd756097 Fix virt_to_phys() warnings
These warnings were observed on MIPS32 using 2.6.31-rc1 and gcc-4.2.0:

mm/page_alloc.c: In function 'alloc_pages_exact':
mm/page_alloc.c:1986: warning: passing argument 1 of 'virt_to_phys' makes pointer from integer without a cast

drivers/usb/mon/mon_bin.c: In function 'mon_alloc_buff':
drivers/usb/mon/mon_bin.c:1264: warning: passing argument 1 of 'virt_to_phys' makes pointer from integer without a cast

[akpm@linux-foundation.org: fix kernel/perf_counter.c too]
Signed-off-by: Kevin Cernekee <cernekee@gmail.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-07-06 13:57:03 -07:00
Xiao Guangrong
e1af3aec3e tracing: Fix trace_print_seq()
We will lose something if trace_seq->buffer[0] is 0, because the copy length
is calculated by strlen() in seq_puts(), so using seq_write() instead of
seq_puts().

There have a example:
after reboot:

 # echo kmemtrace > current_tracer
 # echo 0 > options/kmem_minimalistic
 # cat trace
 # tracer: kmemtrace
 #
 #

Nothing is exported, because the first byte of trace_seq->buffer[ ]
is KMEMTRACE_USER_ALLOC.

( the value of KMEMTRACE_USER_ALLOC is zero, seeing
  kmemtrace_print_alloc_user() in kernel/trace/kmemtrace.c)

Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <4A4B2351.5010300@cn.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-02 08:51:13 +02:00
Masami Hiramatsu
4a2bb6fcc8 kprobes: No need to unlock kprobe_insn_mutex
Remove needless kprobe_insn_mutex unlocking during safety check
in garbage collection, because if someone releases a dirty slot
during safety check (which ensures other cpus doesn't execute
all dirty slots), the safety check must be fail. So, we need to
hold the mutex while checking safety.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
LKML-Reference: <20090630210809.17851.28781.stgit@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-07-01 10:43:07 +02:00
Linus Torvalds
e83c2b0ff3 Merge branch 'kmemleak' of git://linux-arm.org/linux-2.6
* 'kmemleak' of git://linux-arm.org/linux-2.6:
  kmemleak: Inform kmemleak about pid_hash
  kmemleak: Do not warn if an unknown object is freed
  kmemleak: Do not report new leaked objects if the scanning was stopped
  kmemleak: Slightly change the policy on newly allocated objects
  kmemleak: Do not trigger a scan when reading the debug/kmemleak file
  kmemleak: Simplify the reports logged by the scanning thread
  kmemleak: Enable task stacks scanning by default
  kmemleak: Allow the early log buffer to be configurable.
2009-06-30 19:04:53 -07:00