* 'core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (63 commits)
stacktrace: provide save_stack_trace_tsk() weak alias
rcu: provide RCU options on non-preempt architectures too
printk: fix discarding message when recursion_bug
futex: clean up futex_(un)lock_pi fault handling
"Tree RCU": scalable classic RCU implementation
futex: rename field in futex_q to clarify single waiter semantics
x86/swiotlb: add default swiotlb_arch_range_needs_mapping
x86/swiotlb: add default phys<->bus conversion
x86: unify pci iommu setup and allow swiotlb to compile for 32 bit
x86: add swiotlb allocation functions
swiotlb: consolidate swiotlb info message printing
swiotlb: support bouncing of HighMem pages
swiotlb: factor out copy to/from device
swiotlb: add arch hook to force mapping
swiotlb: allow architectures to override phys<->bus<->phys conversions
swiotlb: add comment where we handle the overflow of a dma mask on 32 bit
rcu: fix rcutorture behavior during reboot
resources: skip sanity check of busy resources
swiotlb: move some definitions to header
swiotlb: allow architectures to override swiotlb pool allocation
...
Fix up trivial conflicts in
arch/x86/kernel/Makefile
arch/x86/mm/init_32.c
include/linux/hardirq.h
as per Ingo's suggestions.
* 'tracing-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (241 commits)
sched, trace: update trace_sched_wakeup()
tracing/ftrace: don't trace on early stage of a secondary cpu boot, v3
Revert "x86: disable X86_PTRACE_BTS"
ring-buffer: prevent false positive warning
ring-buffer: fix dangling commit race
ftrace: enable format arguments checking
x86, bts: memory accounting
x86, bts: add fork and exit handling
ftrace: introduce tracing_reset_online_cpus() helper
tracing: fix warnings in kernel/trace/trace_sched_switch.c
tracing: fix warning in kernel/trace/trace.c
tracing/ring-buffer: remove unused ring_buffer size
trace: fix task state printout
ftrace: add not to regex on filtering functions
trace: better use of stack_trace_enabled for boot up code
trace: add a way to enable or disable the stack tracer
x86: entry_64 - introduce FTRACE_ frame macro v2
tracing/ftrace: add the printk-msg-only option
tracing/ftrace: use preempt_enable_no_resched_notrace in ring_buffer_time_stamp()
x86, bts: correctly report invalid bts records
...
Fixed up trivial conflict in scripts/recordmcount.pl due to SH bits
being already partly merged by the SH merge.
For CONFIG_SPARSEMEM_VMEMMAP=y on s390 I get warnings like
init/main.c: In function 'start_kernel':
init/main.c:641: warning: format '%08lx' expects type 'long unsigned int', but argument 2 has type 'int'
The warning can be suppressed with a cast to unsigned long in the
CONFIG_SPARSEMEM_VMEMMAP=y version of __page_to_pfn.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Impact: Cleanup and branch hints only.
Move the track and untrack pfn stub routines from memory.c to asm-generic.
Also add unlikely to pfnmap related calls in fork and exit path.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Impact: New mm functionality.
Add pgprot_writecombine. pgprot_writecombine will be aliased to
pgprot_noncached when not supported by the architecture.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Impact: reduce bug table size
This allows reducing the bug table size by half. Perhaps there are
other 64-bit architectures that could also make use of this.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: let the function-graph-tracer be aware of the irq entrypoints
Add a new .irqentry.text section to store the irq entrypoints functions
inside the same section. This way, the tracer will be able to signal
an interrupts triggering on output by recognizing these entrypoints.
Also, make this section recordable for dynamic tracing.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
If you enable some common audit code, the kernel fails to build.
In file included from lib/audit.c:17:
include/asm-generic/audit_write.h:3: error: '__NR_swapon' undeclared here (not in a function)
make[1]: *** [lib/audit.o] Error 1
make: *** [lib] Error 2
So do not use __NR_swapon if it isnt defined for a port.
Signed-off-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Impact: cleanup, eliminate code
now that warn_on_slowpath() uses warn_slowpath(...,NULL), we can
eliminate warn_on_slowpath() altogether and use warn_slowpath().
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: feature to profile if statements
This patch adds a branch profiler for all if () statements.
The results will be found in:
/debugfs/tracing/profile_branch
For example:
miss hit % Function File Line
------- --------- - -------- ---- ----
0 1 100 x86_64_start_reservations head64.c 127
0 1 100 copy_bootdata head64.c 69
1 0 0 x86_64_start_kernel head64.c 111
32 0 0 set_intr_gate desc.h 319
1 0 0 reserve_ebda_region head.c 51
1 0 0 reserve_ebda_region head.c 47
0 1 100 reserve_ebda_region head.c 42
0 0 X maxcpus main.c 165
Miss means the branch was not taken. Hit means the branch was taken.
The percent is the percentage the branch was taken.
This adds a significant amount of overhead and should only be used
by those analyzing their system.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: clean up to make one profiler of like and unlikely tracer
The likely and unlikely profiler prints out the file and line numbers
of the annotated branches that it is profiling. It shows the number
of times it was correct or incorrect in its guess. Having two
different files or sections for that matter to tell us if it was a
likely or unlikely is pretty pointless. We really only care if
it was correct or not.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: API *CHANGE*. Must update all tracepoint users.
Add DEFINE_TRACE() to tracepoints to let them declare the tracepoint
structure in a single spot for all the kernel. It helps reducing memory
consumption, especially when declaring a lot of tracepoints, e.g. for
kmalloc tracing.
*API CHANGE WARNING*: now, DECLARE_TRACE() must be used in headers for
tracepoint declarations rather than DEFINE_TRACE(). This is the sane way
to do it. The name previously used was misleading.
Updates scheduler instrumentation to follow this API change.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: name change of unlikely tracer and profiler
Ingo Molnar suggested changing the config from UNLIKELY_PROFILE
to BRANCH_PROFILING. I never did like the "unlikely" name so I
went one step farther, and renamed all the unlikely configurations
to a "BRANCH" variant.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: new unlikely/likely profiler
Andrew Morton recently suggested having an in-kernel way to profile
likely and unlikely macros. This patch achieves that goal.
When configured, every(*) likely and unlikely macro gets a counter attached
to it. When the condition is hit, the hit and misses of that condition
are recorded. These numbers can later be retrieved by:
/debugfs/tracing/profile_likely - All likely markers
/debugfs/tracing/profile_unlikely - All unlikely markers.
# cat /debug/tracing/profile_unlikely | head
correct incorrect % Function File Line
------- --------- - -------- ---- ----
2167 0 0 do_arch_prctl process_64.c 832
0 0 0 do_arch_prctl process_64.c 804
2670 0 0 IS_ERR err.h 34
71230 5693 7 __switch_to process_64.c 673
76919 0 0 __switch_to process_64.c 639
43184 33743 43 __switch_to process_64.c 624
12740 64181 83 __switch_to process_64.c 594
12740 64174 83 __switch_to process_64.c 590
# cat /debug/tracing/profile_unlikely | \
awk '{ if ($3 > 25) print $0; }' |head -20
44963 35259 43 __switch_to process_64.c 624
12762 67454 84 __switch_to process_64.c 594
12762 67447 84 __switch_to process_64.c 590
1478 595 28 syscall_get_error syscall.h 51
0 2821 100 syscall_trace_leave ptrace.c 1567
0 1 100 native_smp_prepare_cpus smpboot.c 1237
86338 265881 75 calc_delta_fair sched_fair.c 408
210410 108540 34 calc_delta_mine sched.c 1267
0 54550 100 sched_info_queued sched_stats.h 222
51899 66435 56 pick_next_task_fair sched_fair.c 1422
6 10 62 yield_task_fair sched_fair.c 982
7325 2692 26 rt_policy sched.c 144
0 1270 100 pre_schedule_rt sched_rt.c 1261
1268 48073 97 pick_next_task_rt sched_rt.c 884
0 45181 100 sched_info_dequeued sched_stats.h 177
0 15 100 sched_move_task sched.c 8700
0 15 100 sched_move_task sched.c 8690
53167 33217 38 schedule sched.c 4457
0 80208 100 sched_info_switch sched_stats.h 270
30585 49631 61 context_switch sched.c 2619
# cat /debug/tracing/profile_likely | awk '{ if ($3 > 25) print $0; }'
39900 36577 47 pick_next_task sched.c 4397
20824 15233 42 switch_mm mmu_context_64.h 18
0 7 100 __cancel_work_timer workqueue.c 560
617 66484 99 clocksource_adjust timekeeping.c 456
0 346340 100 audit_syscall_exit auditsc.c 1570
38 347350 99 audit_get_context auditsc.c 732
0 345244 100 audit_syscall_entry auditsc.c 1541
38 1017 96 audit_free auditsc.c 1446
0 1090 100 audit_alloc auditsc.c 862
2618 1090 29 audit_alloc auditsc.c 858
0 6 100 move_masked_irq migration.c 9
1 198 99 probe_sched_wakeup trace_sched_switch.c 58
2 2 50 probe_wakeup trace_sched_wakeup.c 227
0 2 100 probe_wakeup_sched_switch trace_sched_wakeup.c 144
4514 2090 31 __grab_cache_page filemap.c 2149
12882 228786 94 mapping_unevictable pagemap.h 50
4 11 73 __flush_cpu_slab slub.c 1466
627757 330451 34 slab_free slub.c 1731
2959 61245 95 dentry_lru_del_init dcache.c 153
946 1217 56 load_elf_binary binfmt_elf.c 904
102 82 44 disk_put_part genhd.h 206
1 1 50 dst_gc_task dst.c 82
0 19 100 tcp_mss_split_point tcp_output.c 1126
As you can see by the above, there's a bit of work to do in rethinking
the use of some unlikelys and likelys. Note: the unlikely case had 71 hits
that were more than 25%.
Note: After submitting my first version of this patch, Andrew Morton
showed me a version written by Daniel Walker, where I picked up
the following ideas from:
1) Using __builtin_constant_p to avoid profiling fixed values.
2) Using __FILE__ instead of instruction pointers.
3) Using the preprocessor to stop all profiling of likely
annotations from vsyscall_64.c.
Thanks to Andrew Morton, Arjan van de Ven, Theodore Tso and Ingo Molnar
for their feed back on this patch.
(*) Not ever unlikely is recorded, those that are used by vsyscalls
(a few of them) had to have profiling disabled.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Theodore Tso <tytso@mit.edu>
Cc: Arjan van de Ven <arjan@infradead.org>
Cc: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Fix the __pfn_to_page(pfn) macro so that it doesn't evaluate its
argument twice in the CONFIG_DISCONTIGMEM=y case, because 'pfn' may
be a result of a funtion call having side effects.
For example, the hibernation code applies pfn_to_page(pfn) to the
result of a function returning the pfn corresponding to the next set
bit in a bitmap and the current bit position is modified on each
call. This leads to "interesting" failures for CONFIG_DISCONTIGMEM=y
due to the current behavior of __pfn_to_page(pfn).
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Impact: build fix
DIE_OOPS is now used in the generic trace handling code so it needs to
be defined for all architectures. Define it in asm-generic so that it's
available to all by default and doesn't cause build errors for
architectures that rely on the generic implementation.
Signed-off-by: Jonas Bonn <jonas.bonn@gmail.com>
Acked-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
- atomic operations which both modify the variable and return something imply
full smp memory barriers before and after the memory operations involved
(failing atomic_cmpxchg, atomic_add_unless, etc don't imply a barrier because
they don't modify the target). See Documentation/atomic_ops.txt.
So remove extra barriers and branches.
- All architectures support atomic_cmpxchg. This has no relation to
__HAVE_ARCH_CMPXCHG. We can just take the atomic_cmpxchg path unconditionally
This reduces a simple single threaded fastpath lock+unlock test from 590 cycles
to 203 cycles on a ppc970 system.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
powerpc doesn't use the generic WARN_ON infrastructure. The newly
introduced WARN() as a result didn't print the message, this patch adds
the printk for this specific case.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>