Commit Graph

1825 Commits

Author SHA1 Message Date
Linus Torvalds
c0c463d34a Merge branches 'x86-urgent-for-linus', 'core-debug-for-linus', 'irq-core-for-linus' and 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  um: Make rwsem.S depend on CONFIG_RWSEM_XCHGADD_ALGORITHM

* 'core-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  debug: Make CONFIG_EXPERT select CONFIG_DEBUG_KERNEL to unhide debug options

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  genirq: Remove unused CHECK_IRQ_PER_CPU()

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf tools, x86: Fix 32-bit compile on 64-bit system
2011-07-23 10:33:08 -07:00
David Ahern
08a4a43fc4 perf tools, x86: Fix 32-bit compile on 64-bit system
Builds for 32-bit perf binaries on a 64-bit host currently fail
with this error:

 [...]
 bench/../../../arch/x86/lib/memcpy_64.S: Assembler messages:
 bench/../../../arch/x86/lib/memcpy_64.S:29: Error: bad register name `%rdi'
 bench/../../../arch/x86/lib/memcpy_64.S:34: Error: invalid instruction suffix for `movs'
 bench/../../../arch/x86/lib/memcpy_64.S:50: Error: bad register name `%rdi'
 bench/../../../arch/x86/lib/memcpy_64.S:61: Error: bad register name `%rdi'
 ...

The problem is the detection of the host arch without considering passed in
flags. This change fixes 32-bit builds via:

make EXTRA_CFLAGS=-m32

and 64-bit builds still reference the memcpy_64.S.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@kernel.org>
Link: http://lkml.kernel.org/r/1310420304-21452-1-git-send-email-dsahern@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-21 13:42:30 +02:00
Jiri Olsa
baf040a0d1 perf tools: Make test use the preset debugfs path
Use preset debugfs path instead of hardcoded one.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: acme@redhat.com
Cc: a.p.zijlstra@chello.nl
Cc: paulus@samba.org
Link: http://lkml.kernel.org/r/1310635534-4013-4-git-send-email-jolsa@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-21 10:41:14 +02:00
Jiri Olsa
13b62567e9 perf tools: Add automated tests for events parsing
Adding builtin test for parse_events function, which is
responsible for parsing/processing "-e" option for
stat/top/record commands.

This new test will run within the builtin test command suite
(perf test).

One or several tests were added for each type of event.
More tests could be added easily if needed.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: acme@redhat.com
Cc: a.p.zijlstra@chello.nl
Cc: paulus@samba.org
Link: http://lkml.kernel.org/r/1310635534-4013-3-git-send-email-jolsa@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-21 10:41:13 +02:00
Jiri Olsa
f120f9d51b perf tools: De-opt the parse_events function
Moving out the option parameter from parse_events function,
and adding new parse_events_option function instead.

The option parameter is used only to carry "struct perf_evlist"
pointer for chaining new events. Putting it away, enable us
to call parse_events from other places without using the
option parameter.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: acme@redhat.com
Cc: a.p.zijlstra@chello.nl
Cc: paulus@samba.org
Link: http://lkml.kernel.org/r/1310635534-4013-2-git-send-email-jolsa@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-21 10:41:11 +02:00
David Ahern
adc4bf9955 perf script: Fix display of IP address for non-callchain path
Non-callchain path is using al.addr which prints as:
  openssl 14564 17672.003587:       7862d _x86_64_AES_encrypt_compact

This should be sample->ip to print as:
  openssl 14564 17672.003587:  3f7867862d _x86_64_AES_encrypt_compact

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: acme@ghostprotocols.net
Cc: peterz@infradead.org
Cc: paulus@samba.org
Link: http://lkml.kernel.org/r/1306768587-15376-1-git-send-email-dsahern@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-21 10:09:28 +02:00
David Ahern
eda3913bb7 perf tools: Fix endian conversion reading event attr from file header
The perf_event_attr struct has two __u32's at the top and
they need to be swapped individually.

With this change I was able to analyze a perf.data collected in a
32-bit PPC VM on an x86 system. I tested both 32-bit and 64-bit
binaries for the Intel analysis side; both read the PPC perf.data
file correctly.

-v2:
 - changed the existing perf_event__attr_swap() to swap only elements
   of perf_event_attr and exported it for use in swapping the
   attributes in the file header
 - updated swap_ops used for processing events

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: acme@ghostprotocols.net
Cc: peterz@infradead.org
Cc: paulus@samba.org
Cc: <stable@kernel.org>
Link: http://lkml.kernel.org/r/1310754849-12474-1-git-send-email-dsahern@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-21 09:57:36 +02:00
Jiri Olsa
0111919da2 perf tools: Add missing 'node' alias to the hw_cache[] array
Add "node" as a simple alias for NODE cache events.

The addition of NODE cache events broke the parse_alias
function, so any mismatched event caused the segfault, like:

  # ./perf stat -e krava ls

The hw_cache/hw_cache_op/hw_cache_result arrays needs to follow
PERF_COUNT_HW_CACHE_*MAX enums. Adding those MAXs to be size
of those arrays, so possible ommision in future wil not lead to
segfault.

Adding read/write/prefetch as allowed operations for node cache
event.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: acme@redhat.com
Link: http://lkml.kernel.org/r/20110713205818.GB7827@jolsa.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-21 09:54:51 +02:00
Masami Hiramatsu
14a8fd7cee perf probe: Support adding probes on offline kernel modules
Support adding probes on offline kernel modules. This enables
perf-probe to trace kernel-module init functions via perf-probe.
If user gives the path of module with -m option, perf-probe
expects the module is offline.
This feature works with --add, --funcs, and --vars.

E.g)
 # perf probe -m /lib/modules/`uname -r`/kernel/fs/btrfs/btrfs.ko \
   -a "extent_io_init:5 extent_state_cache"
 Add new events:
   probe:extent_io_init (on extent_io_init:5 with extent_state_cache)
   probe:extent_io_init_1 (on extent_io_init:5 with extent_state_cache)

 You can now use it on all perf tools, such as:

         perf record -e probe:extent_io_init_1 -aR sleep 1

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Link: http://lkml.kernel.org/r/20110627072751.6528.10230.stgit@fedora15
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-07-15 16:25:12 -04:00
Masami Hiramatsu
190b57fcb9 perf probe: Add probed module in front of function
Add probed module name and ":" in front of function name
if -m module option is given. In the result, the symbol
name passed to kprobe-tracer becomes MODULE:FUNCTION,
so that kallsyms can solve it as a symbol in the module
correctly.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Link: http://lkml.kernel.org/r/20110627072745.6528.26416.stgit@fedora15
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-07-15 16:19:08 -04:00
Masami Hiramatsu
ff74178350 perf probe: Introduce debuginfo to encapsulate dwarf information
Introduce debuginfo to encapsulate dwarf information.
This new object allows us to reuse and expand debuginfo easily.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Link: http://lkml.kernel.org/r/20110627072739.6528.12438.stgit@fedora15
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-07-15 16:14:19 -04:00
Masami Hiramatsu
e0d153c690 perf-probe: Move dwarf library routines to dwarf-aux.{c, h}
Move dwarf library related routines to dwarf-aux.{c,h}.
This includes several minor changes.
- Add simple documents for each API.
- Rename die_find_real_subprogram() to die_find_realfunc()
- Rename line_walk_handler_t to line_walk_callback_t.
- Minor cleanups.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Link: http://lkml.kernel.org/r/20110627072727.6528.57647.stgit@fedora15
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-07-15 16:10:17 -04:00
Masami Hiramatsu
bcfc082150 perf probe: Remove redundant dwarf functions
Since there are dwarf_bitsize, dwarf_bitoffset and dwarf_bytesize
defined in libdw, we don't need die_get_bit_size, die_get_bit_offset
and die_get_byte_size anymore.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Link: http://lkml.kernel.org/r/20110627072721.6528.2747.stgit@fedora15
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-07-15 16:04:47 -04:00
Masami Hiramatsu
bad03ae476 perf probe: Move strtailcmp to string.c
Since strtailcmp() is enough generic, it should be defined in string.c.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Link: http://lkml.kernel.org/r/20110627072715.6528.10677.stgit@fedora15
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-07-15 16:00:47 -04:00
Masami Hiramatsu
baad2d3e69 perf probe: Rename DIE_FIND_CB_FOUND to DIE_FIND_CB_END
Since die_find/walk* callbacks use DIE_FIND_CB_FOUND for
both of failed and found cases, it should be "END"
instead "FOUND" for avoiding confusion.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Reported-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Link: http://lkml.kernel.org/r/20110627072709.6528.45706.stgit@fedora15
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-07-15 15:55:57 -04:00
Sonny Rao
259032bfe3 perf: Robustify proc and debugfs file recording
While attempting to create a timechart of boot up I found perf didn't
tolerate modules being loaded/unloaded.  This patch fixes this by
reading the file once and then writing the size read at the correct
point in the file.  It also simplifies the code somewhat.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: Sonny Rao <sonnyrao@chromium.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Link: http://lkml.kernel.org/r/10011.1310614483@neuling.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-07-14 15:53:01 -04:00
Anton Blanchard
5d67be97f8 perf report/annotate/script: Add option to specify a CPU range
Add an option to perf report/annotate/script to specify which
CPUs to operate on. This enables us to take a single system wide
profile and analyse each CPU (or group of CPUs) in isolation.

This was useful when profiling a multiprocess workload where the
bottleneck was on one CPU but this was hidden in the overall
profile. Per process and per thread breakdowns didn't help
because multiple processes were running on each CPU and no
single process consumed an entire CPU.

The patch converts the list of CPUs returned by cpu_map__new
into a bitmap for fast lookup. I wanted to use -C to be
consistent with perf top/record/stat, but unfortunately perf
report already uses -C <comms>.

 v2: Incorporate suggestions from David Ahern:
	- Added -c to perf script
	- Check that SAMPLE_CPU is set when -c is used
	- Update documentation

 v3: Create perf_session__cpu_bitmap()

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Link: http://lkml.kernel.org/r/20110704215750.11647eb9@kryten
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-05 10:44:44 +02:00
Zhengyu He
3ae9a34d74 perf stat: Add noise output for csv mode
Previously, when you want perf-stat to output the statistics in
csv mode, no information of the noise will be printed out.

For example right now we output this --repeat information:

 ./perf stat -r3 -x, sleep 1
 1.164789,task-clock
 8,context-switches
 0,CPU-migrations
 219,page-faults
 3337800,cycles

With this patch, the output will be appended with an additional
entry for the noise value:

 ./perf stat -r3 -x, sleep 1
 1.164789,task-clock,3.75%
 8,context-switches,75.00%
 0,CPU-migrations,100.00%
 219,page-faults,0.00%
 3337800,cycles,3.36%

Signed-off-by: Zhengyu He <zhengyuh@google.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Stephane Eranian <eranian@google.com>
Cc: Venkatesh Pallipadi <venki@google.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1308861942-4945-1-git-send-email-zhengyuh@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-01 12:52:40 +02:00
Ingo Molnar
343a031f3c Merge branch 'perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/core 2011-07-01 11:51:58 +02:00
Ingo Molnar
10e6962765 Merge commit 'v3.0-rc5' into perf/core
Merge reason: Pick up the latest fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-01 10:28:46 +02:00
Frederic Weisbecker
cb1955b86c perf tools: Only display parent field if explictly sorted
We don't need to display the parent field if the parent
sorting machinery is only used for parent filtering
(as in "-p foo").

However if parent filtering is used in combination with
explicit parent sorting ( -s parent), we want to
display it.

Result with:

  perf report -p kernel_thread -s parent

Before:

 # Overhead  Parent symbol
 # ........  .............
 #
     0.07%
            |
            --- ioread8
                ata_sff_check_status
                ata_sff_tf_load
                ata_sff_qc_issue
                ata_bmdma_qc_issue
                ata_qc_issue
                ata_scsi_translate
                ata_scsi_queuecmd
                scsi_dispatch_cmd
                scsi_request_fn
                __blk_run_queue
                __make_request
                generic_make_request
                submit_bio
                submit_bh
                journal_submit_commit_record
                jbd2_journal_commit_transaction
                kjournald2
                kthread
                kernel_thread_helpe

After:

 # Overhead  Parent symbol
 # ........  .............
 #
     0.07%  kernel_thread_helper
            |
            --- ioread8
                ata_sff_check_status
                ata_sff_tf_load
                ata_sff_qc_issue
                ata_bmdma_qc_issue
                ata_qc_issue
                ata_scsi_translate
                ata_scsi_queuecmd
                scsi_dispatch_cmd
                scsi_request_fn
                __blk_run_queue
                __make_request
                generic_make_request
                submit_bio
                submit_bh
                journal_submit_commit_record
                jbd2_journal_commit_transaction
                kjournald2
                kthread
                kernel_thread_helper

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Sam Liao <phyomh@gmail.com>
2011-06-30 00:26:49 +02:00
Frederic Weisbecker
fd8ea21276 perf tools: Allow sort dimensions to be registered more than once
So that the parent sort dimension can be registered twice: once
if we add it as an explicit sort dimension (-s parent) and twice
if we request a parent filter (-p foo).

We'll have only one parent sort dimension in the end but this
allows to override the default parent filter with we gave in "-p"
option. The goal of this is to prepare to allow the use of
"-s parent" and "-p foo" at the same time, ie: sort by filtered
parent.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Sam Liao <phyomh@gmail.com>
2011-06-30 00:26:41 +02:00
Frederic Weisbecker
e84d21227c perf tools: Don't display ignored entries on stdio ui
As for newt ui, don't display entries that have been marked
as ignored.

The practical current effect of this is to make parent
filtering really working. Before, entries that were ignored
were given a null parent but were still displayed. This
resulted in some weird effects:

 # Overhead      Command      Shared Object        Symbol
 # ........  ...........  .................  ............
 #
^A
                   |
                   --- __lock_acquire
                      |
                      |--95.97%-- lock_acquire
                      |          |
                      |          |--30.75%-- _raw_spin_lock

Discard these from the stdio display.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Sam Liao <phyomh@gmail.com>
2011-06-30 00:26:33 +02:00
Frederic Weisbecker
2fd701bc78 perf tools: Remove sort print helpers declarations
These are probably some old leftovers.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Sam Liao <phyomh@gmail.com>
2011-06-30 00:26:19 +02:00
Frederic Weisbecker
872a878fb1 perf tools: Make sort operations static
These don't need to be globally visible.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Sam Liao <phyomh@gmail.com>
2011-06-30 00:25:12 +02:00