Commit Graph

272 Commits

Author SHA1 Message Date
Ingo Molnar b00560f2d4 Merge branch 'perf/urgent' into perf/core
Merge reason: we need to queue up dependent patch

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-16 13:27:23 +01:00
Peter Zijlstra 4fe757dd48 perf: Fix throttle logic
It was possible to call pmu::start() on an already running event. In
particular this lead so some wreckage as the hrtimer events would
re-initialize active timers.

This was due to throttled events being activated again by scheduling.
Scheduling in a context would add and force start events, resulting in
running events with a possible throttle status. The next tick to hit
that task will then try to unthrottle the event and call ->start() on
an already running event.

Reported-by: Jeff Moyer <jmoyer@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-16 13:25:29 +01:00
Ingo Molnar c7f9a6f377 Merge branch 'linus' into perf/core
Merge reason: Pick up perf fixes that are now upstream

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-07 08:44:26 +01:00
Peter Zijlstra 542e72fc90 perf: Fix reading in perf_event_read()
It is quite possible for the event to have been disabled between
perf_event_read() sending the IPI and the CPU servicing the IPI and
calling __perf_event_read(), hence revalidate the state.

Reported-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-03 12:15:46 +01:00
Peter Zijlstra fe4b04fa31 perf: Cure task_oncpu_function_call() races
Oleg reported that on architectures with
__ARCH_WANT_INTERRUPTS_ON_CTXSW the IPI from
task_oncpu_function_call() can land before perf_event_task_sched_in()
and cause interesting situations for eg. perf_install_in_context().

This patch reworks the task_oncpu_function_call() interface to give a
more usable primitive as well as rework all its users to hopefully be
more obvious as well as remove the races.

While looking at the code I also found a number of races against
perf_event_task_sched_out() which can flip contexts between tasks so
plug those too.

Reported-and-reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-03 12:14:43 +01:00
Eric Dumazet 88d4f0db7f perf: Fix alloc_callchain_buffers()
Commit 927c7a9e92 ("perf: Fix race in callchains") introduced
a mismatch in the sizing of struct callchain_cpus_entries.

nr_cpu_ids must be used instead of num_possible_cpus(), or we
might get out of bound memory accesses on some machines.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Miller <davem@davemloft.net>
Cc: Stephane Eranian <eranian@google.com>
CC: stable@kernel.org
LKML-Reference: <1295980851.3588.351.camel@edumazet-laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-27 19:21:50 +01:00
Oleg Nesterov 806839b22c perf: perf_event_exit_task_context: s/rcu_dereference/rcu_dereference_raw/
In theory, almost every user of task->child->perf_event_ctxp[]
is wrong. find_get_context() can install the new context at any
moment, we need read_barrier_depends().

dbe08d82ce "perf: Fix
find_get_context() vs perf_event_exit_task() race" added
rcu_dereference() into perf_event_exit_task_context() to make
the precedent, but this makes __rcu_dereference_check() unhappy.
Use rcu_dereference_raw() to shut up the warning.

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: acme@redhat.com
Cc: paulus@samba.org
Cc: stern@rowland.harvard.edu
Cc: a.p.zijlstra@chello.nl
Cc: fweisbec@gmail.com
Cc: roland@redhat.com
Cc: prasad@linux.vnet.ibm.com
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
LKML-Reference: <20110121174547.GA8796@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-21 22:08:16 +01:00
Peter Zijlstra 547e9fd7d3 perf: Annotate cpuctx->ctx.mutex to avoid a lockdep splat
Lockdep spotted:

	loop_1b_instruc/1899 is trying to acquire lock:
	 (event_mutex){+.+.+.}, at: [<ffffffff810e1908>] perf_trace_init+0x3b/0x2f7

	but task is already holding lock:
	 (&ctx->mutex){+.+.+.}, at: [<ffffffff810eb45b>] perf_event_init_context+0xc0/0x218

	which lock already depends on the new lock.

	the existing dependency chain (in reverse order) is:

	-> #3 (&ctx->mutex){+.+.+.}:
	-> #2 (cpu_hotplug.lock){+.+.+.}:
	-> #1 (module_mutex){+.+...}:
	-> #0 (event_mutex){+.+.+.}:

But because the deadlock would be cpuhotplug (cpu-event) vs fork
(task-event) it cannot, in fact, happen. We can annotate this by giving the
perf_event_context used for the cpuctx a different lock class from those
used by tasks.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-21 16:32:42 +01:00
Oleg Nesterov 8550d7cb6e perf: Fix perf_event_init_task()/perf_event_free_task() interaction
perf_event_init_task() should clear child->perf_event_ctxp[]
before anything else. Otherwise, if
perf_event_init_context(perf_hw_context) fails,
perf_event_free_task() can free perf_event_ctxp[perf_sw_context]
copied from parent->perf_event_ctxp[] by dup_task_struct().

Also move the initialization of perf_event_mutex and
perf_event_list from perf_event_init_context() to
perf_event_init_context().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Prasad <prasad@linux.vnet.ibm.com>
Cc: Roland McGrath <roland@redhat.com>
LKML-Reference: <20110119182228.GC12183@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-19 20:04:28 +01:00
Oleg Nesterov dbe08d82ce perf: Fix find_get_context() vs perf_event_exit_task() race
find_get_context() must not install the new perf_event_context
if the task has already passed perf_event_exit_task().

If nothing else, this means the memory leak. Initially
ctx->refcount == 2, it is supposed that
perf_event_exit_task_context() should participate and do the
necessary put_ctx().

find_lively_task_by_vpid() checks PF_EXITING but this buys
nothing, by the time we call find_get_context() this task can be
already dead. To the point, cmpxchg() can succeed when the task
has already done the last schedule().

Change find_get_context() to populate task->perf_event_ctxp[]
under task->perf_event_mutex, this way we can trust PF_EXITING
because perf_event_exit_task() takes the same mutex.

Also, change perf_event_exit_task_context() to use
rcu_dereference(). Probably this is not strictly needed, but
with or without this change find_get_context() can race with
setup_new_exec()->perf_event_exit_task(), rcu_dereference()
looks better.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Prasad <prasad@linux.vnet.ibm.com>
Cc: Roland McGrath <roland@redhat.com>
LKML-Reference: <20110119182207.GB12183@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-19 20:04:27 +01:00
Linus Torvalds 335bc70b6b Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf: Validate cpu early in perf_event_alloc()
  perf: Find_get_context: fix the per-cpu-counter check
  perf: Fix contexted inheritance
2011-01-18 14:29:37 -08:00
Oleg Nesterov 66832eb4ba perf: Validate cpu early in perf_event_alloc()
Starting from perf_event_alloc()->perf_init_event(), the kernel
assumes that event->cpu is either -1 or the valid CPU number.

Change perf_event_alloc() to validate this argument early. This
also means we can remove the similar check in
find_get_context().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Prasad <prasad@linux.vnet.ibm.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: gregkh@suse.de
Cc: stable@kernel.org
LKML-Reference: <20110118161032.GC693@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-18 19:34:23 +01:00
Oleg Nesterov 22a4ec7290 perf: Find_get_context: fix the per-cpu-counter check
If task == NULL, find_get_context() should always check that cpu
is correct.

Afaics, the bug was introduced by 38a81da2 "perf events: Clean
up pid passing", but even before that commit "&& cpu != -1" was
not exactly right, -ESRCH from find_task_by_vpid() is not
accurate.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Prasad <prasad@linux.vnet.ibm.com>
Cc: Roland McGrath <roland@redhat.com>
Cc: gregkh@suse.de
Cc: stable@kernel.org
LKML-Reference: <20110118161008.GB693@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-18 19:34:23 +01:00
Peter Zijlstra c5ed514559 perf: Fix contexted inheritance
Linus reported that the RCU lockdep annotation bits triggered for this
rcu_dereference() because we're not holding rcu_read_lock().

Going over the code I cannot convince myself its correct:

 - holding a ref on the parent_ctx, doesn't avoid it being uncloned
   concurrently (as the comment says), so we can race with a free.

 - holding parent_ctx->mutex doesn't avoid the above free from taking
   place either, it would at best avoid parent_ctx from being freed.

I.e. the warning is correct. To fix the bug, serialize against the
unclone_ctx() call by extending the reach of the parent_ctx->lock.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-18 15:10:35 +01:00
Linus Torvalds 008d23e485 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits)
  Documentation/trace/events.txt: Remove obsolete sched_signal_send.
  writeback: fix global_dirty_limits comment runtime -> real-time
  ppc: fix comment typo singal -> signal
  drivers: fix comment typo diable -> disable.
  m68k: fix comment typo diable -> disable.
  wireless: comment typo fix diable -> disable.
  media: comment typo fix diable -> disable.
  remove doc for obsolete dynamic-printk kernel-parameter
  remove extraneous 'is' from Documentation/iostats.txt
  Fix spelling milisec -> ms in snd_ps3 module parameter description
  Fix spelling mistakes in comments
  Revert conflicting V4L changes
  i7core_edac: fix typos in comments
  mm/rmap.c: fix comment
  sound, ca0106: Fix assignment to 'channel'.
  hrtimer: fix a typo in comment
  init/Kconfig: fix typo
  anon_inodes: fix wrong function name in comment
  fix comment typos concerning "consistent"
  poll: fix a typo in comment
  ...

Fix up trivial conflicts in:
 - drivers/net/wireless/iwlwifi/iwl-core.c (moved to iwl-legacy.c)
 - fs/ext4/ext4.h

Also fix missed 'diabled' typo in drivers/net/bnx2x/bnx2x.h while at it.
2011-01-13 10:05:56 -08:00
Stephane Eranian 4158755d31 perf_events: Add perf_event_time()
Adds perf_event_time() to try and centralize access to event
timing and in particular ctx->time. Prepares for cgroup support.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <4d22059c.122ae30a.5e0e.ffff8b8b@mx.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-07 15:08:51 +01:00
Stephane Eranian 5632ab12e9 perf_events: Generalize use of event_filter_match()
Replace all occurrences of:
	event->cpu != -1 && event->cpu == smp_processor_id()
by a call to:
	event_filter_match(event)

This makes the code more consistent and will make the cgroup
patch smaller.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <4d220593.2308e30a.48c5.ffff8ae9@mx.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-07 15:08:51 +01:00
Stephane Eranian 0b3fcf178d perf_events: Move code around to prepare for cgroup
In particular this patch move perf_event_exit_task() before
cgroup_exit() to allow for cgroup support. The cgroup_exit()
function detaches the cgroups attached to a task.

Other movements include hoisting some definitions and inlines
at the top of perf_event.c

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <4d22058b.cdace30a.4657.ffff95b1@mx.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-07 15:08:50 +01:00
Jiri Kosina 4b7bd36470 Merge branch 'master' into for-next
Conflicts:
	MAINTAINERS
	arch/arm/mach-omap2/pm24xx.c
	drivers/scsi/bfa/bfa_fcpim.c

Needed to update to apply fixes for which the old branch was too
outdated.
2010-12-22 18:57:02 +01:00
Peter Zijlstra abe4340057 perf: Sysfs enumeration
Simple sysfs emumeration of the PMUs.

Use a "event_source" bus, and add PMU devices using their name.

Each PMU device has a type attribute which contrains the value needed
for perf_event_attr::type to identify this PMU.

This is the minimal stub needed to start using this interface,
we'll consider extending the sysfs usage later.

Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Greg KH <gregkh@suse.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20101117222056.316982569@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-12-16 11:36:43 +01:00
Peter Zijlstra 2e80a82a49 perf: Dynamic pmu types
Extend the perf_pmu_register() interface to allow for named and
dynamic pmu types.

Because we need to support the existing static types we cannot use
dynamic types for everything, hence provide a type argument.

If we want to enumerate the PMUs they need a name, provide one.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20101117222056.259707703@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-12-16 11:36:43 +01:00
Ingo Molnar 006b20fe4c Merge branch 'perf/urgent' into perf/core
Merge reason: We want to apply a dependent patch.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-12-16 11:22:27 +01:00
Dan Carpenter ce677831a4 perf: Fix off by one in perf_swevent_init()
The perf_swevent_enabled[] array has PERF_COUNT_SW_MAX elements.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20101024195041.GT5985@bicker>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-12-16 11:14:31 +01:00
Peter Zijlstra c277443cfc perf: Stop all counters on reboot
Use the reboot notifier to detach all running counters on reboot, this
solves a problem with kexec where the new kernel doesn't expect
running counters (rightly so).

It will however decrease the coverage of the NMI watchdog. Making a
kexec specific reboot notifier callback would be best, however that
would require touching all notifier callback handlers as they are not
properly structured to deal with new state.

As a compromise, place the perf reboot notifier at the very last
position in the list.

Reported-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Don Zickus <dzickus@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-12-08 20:16:31 +01:00
Peter Zijlstra 5167695753 perf: Fix duplicate events with multiple-pmu vs software events
Because the multi-pmu bits can share contexts between struct pmu
instances we could get duplicate events by iterating the pmu list.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-12-08 20:14:08 +01:00