* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
um: Make rwsem.S depend on CONFIG_RWSEM_XCHGADD_ALGORITHM
* 'core-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
debug: Make CONFIG_EXPERT select CONFIG_DEBUG_KERNEL to unhide debug options
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
genirq: Remove unused CHECK_IRQ_PER_CPU()
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf tools, x86: Fix 32-bit compile on 64-bit system
Builds for 32-bit perf binaries on a 64-bit host currently fail
with this error:
[...]
bench/../../../arch/x86/lib/memcpy_64.S: Assembler messages:
bench/../../../arch/x86/lib/memcpy_64.S:29: Error: bad register name `%rdi'
bench/../../../arch/x86/lib/memcpy_64.S:34: Error: invalid instruction suffix for `movs'
bench/../../../arch/x86/lib/memcpy_64.S:50: Error: bad register name `%rdi'
bench/../../../arch/x86/lib/memcpy_64.S:61: Error: bad register name `%rdi'
...
The problem is the detection of the host arch without considering passed in
flags. This change fixes 32-bit builds via:
make EXTRA_CFLAGS=-m32
and 64-bit builds still reference the memcpy_64.S.
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@kernel.org>
Link: http://lkml.kernel.org/r/1310420304-21452-1-git-send-email-dsahern@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The perf_event_attr struct has two __u32's at the top and
they need to be swapped individually.
With this change I was able to analyze a perf.data collected in a
32-bit PPC VM on an x86 system. I tested both 32-bit and 64-bit
binaries for the Intel analysis side; both read the PPC perf.data
file correctly.
-v2:
- changed the existing perf_event__attr_swap() to swap only elements
of perf_event_attr and exported it for use in swapping the
attributes in the file header
- updated swap_ops used for processing events
Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: acme@ghostprotocols.net
Cc: peterz@infradead.org
Cc: paulus@samba.org
Cc: <stable@kernel.org>
Link: http://lkml.kernel.org/r/1310754849-12474-1-git-send-email-dsahern@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Add "node" as a simple alias for NODE cache events.
The addition of NODE cache events broke the parse_alias
function, so any mismatched event caused the segfault, like:
# ./perf stat -e krava ls
The hw_cache/hw_cache_op/hw_cache_result arrays needs to follow
PERF_COUNT_HW_CACHE_*MAX enums. Adding those MAXs to be size
of those arrays, so possible ommision in future wil not lead to
segfault.
Adding read/write/prefetch as allowed operations for node cache
event.
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: acme@redhat.com
Link: http://lkml.kernel.org/r/20110713205818.GB7827@jolsa.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Support adding probes on offline kernel modules. This enables
perf-probe to trace kernel-module init functions via perf-probe.
If user gives the path of module with -m option, perf-probe
expects the module is offline.
This feature works with --add, --funcs, and --vars.
E.g)
# perf probe -m /lib/modules/`uname -r`/kernel/fs/btrfs/btrfs.ko \
-a "extent_io_init:5 extent_state_cache"
Add new events:
probe:extent_io_init (on extent_io_init:5 with extent_state_cache)
probe:extent_io_init_1 (on extent_io_init:5 with extent_state_cache)
You can now use it on all perf tools, such as:
perf record -e probe:extent_io_init_1 -aR sleep 1
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Link: http://lkml.kernel.org/r/20110627072751.6528.10230.stgit@fedora15
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Add an option to perf report/annotate/script to specify which
CPUs to operate on. This enables us to take a single system wide
profile and analyse each CPU (or group of CPUs) in isolation.
This was useful when profiling a multiprocess workload where the
bottleneck was on one CPU but this was hidden in the overall
profile. Per process and per thread breakdowns didn't help
because multiple processes were running on each CPU and no
single process consumed an entire CPU.
The patch converts the list of CPUs returned by cpu_map__new
into a bitmap for fast lookup. I wanted to use -C to be
consistent with perf top/record/stat, but unfortunately perf
report already uses -C <comms>.
v2: Incorporate suggestions from David Ahern:
- Added -c to perf script
- Check that SAMPLE_CPU is set when -c is used
- Update documentation
v3: Create perf_session__cpu_bitmap()
Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Link: http://lkml.kernel.org/r/20110704215750.11647eb9@kryten
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Previously, when you want perf-stat to output the statistics in
csv mode, no information of the noise will be printed out.
For example right now we output this --repeat information:
./perf stat -r3 -x, sleep 1
1.164789,task-clock
8,context-switches
0,CPU-migrations
219,page-faults
3337800,cycles
With this patch, the output will be appended with an additional
entry for the noise value:
./perf stat -r3 -x, sleep 1
1.164789,task-clock,3.75%
8,context-switches,75.00%
0,CPU-migrations,100.00%
219,page-faults,0.00%
3337800,cycles,3.36%
Signed-off-by: Zhengyu He <zhengyuh@google.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Stephane Eranian <eranian@google.com>
Cc: Venkatesh Pallipadi <venki@google.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1308861942-4945-1-git-send-email-zhengyuh@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
So that the parent sort dimension can be registered twice: once
if we add it as an explicit sort dimension (-s parent) and twice
if we request a parent filter (-p foo).
We'll have only one parent sort dimension in the end but this
allows to override the default parent filter with we gave in "-p"
option. The goal of this is to prepare to allow the use of
"-s parent" and "-p foo" at the same time, ie: sort by filtered
parent.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Sam Liao <phyomh@gmail.com>
As for newt ui, don't display entries that have been marked
as ignored.
The practical current effect of this is to make parent
filtering really working. Before, entries that were ignored
were given a null parent but were still displayed. This
resulted in some weird effects:
# Overhead Command Shared Object Symbol
# ........ ........... ................. ............
#
^A
|
--- __lock_acquire
|
|--95.97%-- lock_acquire
| |
| |--30.75%-- _raw_spin_lock
Discard these from the stdio display.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Sam Liao <phyomh@gmail.com>