Commit Graph

15111 Commits

Author SHA1 Message Date
Ingo Molnar
3bf2391729 Merge branch 'perf/urgent' into perf/core
Merge in all pending fixes, before pulling the latest development
bits from Arnaldo - which will involve merge conflicts.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-03-21 11:03:10 +01:00
Namhyung Kim
86e213e1d9 perf/cgroup: Add __percpu annotation to perf_cgroup->info
It's a per-cpu data structure but missed the __percpu annotation.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Link: http://lkml.kernel.org/r/1363600594-11453-1-git-send-email-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-03-18 11:02:06 +01:00
Ingo Molnar
1f1b396758 Merge branch 'tip/perf/urgent-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/urgent
Pull tracing fixes from Steven Rostedt.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-03-18 09:48:29 +01:00
Namhyung Kim
d610d98b5d perf: Generate EXIT event only once per task context
perf_event_task_event() iterates pmu list and generate events
for each eligible pmu context.  But if task_event has task_ctx
like in EXIT it'll generate events even though the pmu doesn't
have an eligible one. Fix it by moving the code to proper
places.

Before this patch:

  $ perf record -n true
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.006 MB perf.data (~248 samples) ]

  $ perf report -D | tail
  Aggregated stats:
             TOTAL events:         73
              MMAP events:         67
              COMM events:          2
              EXIT events:          4
  cycles stats:
             TOTAL events:         73
              MMAP events:         67
              COMM events:          2
              EXIT events:          4

After this patch:

  $ perf report -D | tail
  Aggregated stats:
             TOTAL events:         70
              MMAP events:         67
              COMM events:          2
              EXIT events:          1
  cycles stats:
             TOTAL events:         70
              MMAP events:         67
              COMM events:          2
              EXIT events:          1

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1363332433-7637-1-git-send-email-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-03-18 09:47:33 +01:00
Namhyung Kim
778141e3cf perf: Reset hwc->last_period on sw clock events
When cpu/task clock events are initialized, their sampling
frequencies are converted to have a fixed value.  However it
missed to update the hwc->last_period which was set to 1 for
initial sampling frequency calibration.

Because this hwc->last_period value is used as a period in
perf_swevent_ hrtime(), every recorded sample will have an
incorrected period of 1.

  $ perf record -e task-clock noploop 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.158 MB perf.data (~6919 samples) ]

  $ perf report -n --show-total-period  --stdio
  # Samples: 4K of event 'task-clock'
  # Event count (approx.): 4000
  #
  # Overhead       Samples        Period  Command  Shared Object              Symbol
  # ........  ............  ............  .......  .............  ..................
  #
      99.95%          3998          3998  noploop  noploop        [.] main
       0.03%             1             1  noploop  libc-2.15.so   [.] init_cacheinfo
       0.03%             1             1  noploop  ld-2.15.so     [.] open_verify

Note that it doesn't affect the non-sampling event so that the
perf stat still gets correct value with or without this patch.

  $ perf stat -e task-clock noploop 1

   Performance counter stats for 'noploop 1':

         1000.272525 task-clock                #    1.000 CPUs utilized

         1.000560605 seconds time elapsed

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1363574507-18808-1-git-send-email-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-03-18 09:15:18 +01:00
Steven Rostedt (Red Hat)
613f04a0f5 tracing: Prevent buffer overwrite disabled for latency tracers
The latency tracers require the buffers to be in overwrite mode,
otherwise they get screwed up. Force the buffers to stay in overwrite
mode when latency tracers are enabled.

Added a flag_changed() method to the tracer structure to allow
the tracers to see what flags are being changed, and also be able
to prevent the change from happing.

Cc: stable@vger.kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-14 23:40:21 -04:00
Steven Rostedt (Red Hat)
8090282265 tracing: Keep overwrite in sync between regular and snapshot buffers
Changing the overwrite mode for the ring buffer via the trace
option only sets the normal buffer. But the snapshot buffer could
swap with it, and then the snapshot would be in non overwrite mode
and the normal buffer would be in overwrite mode, even though the
option flag states otherwise.

Keep the two buffers overwrite modes in sync.

Cc: stable@vger.kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-14 23:40:15 -04:00
Steven Rostedt (Red Hat)
69d34da298 tracing: Protect tracer flags with trace_types_lock
Seems that the tracer flags have never been protected from
synchronous writes. Luckily, admins don't usually modify the
tracing flags via two different tasks. But if scripts were to
be used to modify them, then they could get corrupted.

Move the trace_types_lock that protects against tracers changing
to also protect the flags being set.

Cc: stable@vger.kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-14 13:50:56 -04:00
anish kumar
b66a2356d7 watchdog: Add comments to explain the watchdog_disabled variable
The watchdog_disabled flag is a bit cryptic. However it's
usefulness is multifold. Uses are:

 1. Check if smpboot_register_percpu_thread function passed.

 2. Makes sure that user enables and disables the watchdog in
    sequence i.e. enable watchdog->disable watchdog->enable watchdog
    Unlike enable watchdog->enable watchdog which is wrong.

Signed-off-by: anish kumar <anish198519851985@gmail.com>
[small text cleanups]
Signed-off-by: Don Zickus <dzickus@redhat.com>
Cc: chuansheng.liu@intel.com
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/1363113848-18344-1-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-03-14 08:24:05 +01:00
Ingo Molnar
0b34083f46 Merge branch 'tip/perf/urgent-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/urgent
Pull tracing fixes from Steven Rostedt.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-03-14 08:12:20 +01:00
Linus Torvalds
842d223f28 Merge branch 'akpm' (fixes from Andrew)
Merge misc fixes from Andrew Morton:

 - A bunch of fixes

 - Finish off the idr API conversions before someone starts to use the
   old interfaces again.

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  idr: idr_alloc() shouldn't trigger lowmem warning when preloaded
  UAPI: fix endianness conditionals in M32R's asm/stat.h
  UAPI: fix endianness conditionals in linux/raid/md_p.h
  UAPI: fix endianness conditionals in linux/acct.h
  UAPI: fix endianness conditionals in linux/aio_abi.h
  decompressors: fix typo "POWERPC"
  mm/fremap.c: fix oops on error path
  idr: deprecate idr_pre_get() and idr_get_new[_above]()
  tidspbridge: convert to idr_alloc()
  zcache: convert to idr_alloc()
  mlx4: remove leftover idr_pre_get() call
  workqueue: convert to idr_alloc()
  nfsd: convert to idr_alloc()
  nfsd: remove unused get_new_stid()
  kernel/signal.c: use __ARCH_HAS_SA_RESTORER instead of SA_RESTORER
  signal: always clear sa_restorer on execve
  mm: remove_memory(): fix end_pfn setting
  include/linux/res_counter.h needs errno.h
2013-03-13 15:21:57 -07:00
Tejun Heo
e68035fb65 workqueue: convert to idr_alloc()
idr_get_new*() and friends are about to be deprecated.  Convert to the
new idr_alloc() interface.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-13 15:21:46 -07:00
Andrew Morton
522cff142d kernel/signal.c: use __ARCH_HAS_SA_RESTORER instead of SA_RESTORER
__ARCH_HAS_SA_RESTORER is the preferred conditional for use in 3.9 and
later kernels, per Kees.

Cc: Emese Revfy <re.emese@gmail.com>
Cc: Emese Revfy <re.emese@gmail.com>
Cc: PaX Team <pageexec@freemail.hu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Julien Tinnes <jln@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-13 15:21:45 -07:00
Kees Cook
2ca39528c0 signal: always clear sa_restorer on execve
When the new signal handlers are set up, the location of sa_restorer is
not cleared, leaking a parent process's address space location to
children.  This allows for a potential bypass of the parent's ASLR by
examining the sa_restorer value returned when calling sigaction().

Based on what should be considered "secret" about addresses, it only
matters across the exec not the fork (since the VMAs haven't changed
until the exec).  But since exec sets SIG_DFL and keeps sa_restorer,
this is where it should be fixed.

Given the few uses of sa_restorer, a "set" function was not written
since this would be the only use.  Instead, we use
__ARCH_HAS_SA_RESTORER, as already done in other places.

Example of the leak before applying this patch:

  $ cat /proc/$$/maps
  ...
  7fb9f3083000-7fb9f3238000 r-xp 00000000 fd:01 404469 .../libc-2.15.so
  ...
  $ ./leak
  ...
  7f278bc74000-7f278be29000 r-xp 00000000 fd:01 404469 .../libc-2.15.so
  ...
  1 0 (nil) 0x7fb9f30b94a0
  2 4000000 (nil) 0x7f278bcaa4a0
  3 4000000 (nil) 0x7f278bcaa4a0
  4 0 (nil) 0x7fb9f30b94a0
  ...

[akpm@linux-foundation.org: use SA_RESTORER for backportability]
Signed-off-by: Kees Cook <keescook@chromium.org>
Reported-by: Emese Revfy <re.emese@gmail.com>
Cc: Emese Revfy <re.emese@gmail.com>
Cc: PaX Team <pageexec@freemail.hu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Cc: Julien Tinnes <jln@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-13 15:21:44 -07:00
Eric W. Biederman
e66eded830 userns: Don't allow CLONE_NEWUSER | CLONE_FS
Don't allowing sharing the root directory with processes in a
different user namespace.  There doesn't seem to be any point, and to
allow it would require the overhead of putting a user namespace
reference in fs_struct (for permission checks) and incrementing that
reference count on practically every call to fork.

So just perform the inexpensive test of forbidding sharing fs_struct
acrosss processes in different user namespaces.  We already disallow
other forms of threading when unsharing a user namespace so this
should be no real burden in practice.

This updates setns, clone, and unshare to disallow multiple user
namespaces sharing an fs_struct.

Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-13 15:00:20 -07:00
Steven Rostedt (Red Hat)
740466bc89 tracing: Fix free of probe entry by calling call_rcu_sched()
Because function tracing is very invasive, and can even trace
calls to rcu_read_lock(), RCU access in function tracing is done
with preempt_disable_notrace(). This requires a synchronize_sched()
for updates and not a synchronize_rcu().

Function probes (traceon, traceoff, etc) must be freed after
a synchronize_sched() after its entry has been removed from the
hash. But call_rcu() is used. Fix this by using call_rcu_sched().

Also fix the usage to use hlist_del_rcu() instead of hlist_del().

Cc: stable@vger.kernel.org
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-13 17:57:44 -04:00
Randy Dunlap
6c23cbbd50 futex: fix kernel-doc notation and spello
Fix kernel-doc warning in futex.c and convert 'Returns' to the new Return:
kernel-doc notation format.

  Warning(kernel/futex.c:2286): Excess function parameter 'clockrt' description in 'futex_wait_requeue_pi'

Fix one spello.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-12 20:42:10 -07:00
Randy Dunlap
20f22ab42e signals: fix new kernel-doc warnings
Fix new kernel-doc warnings in kernel/signal.c:

  Warning(kernel/signal.c:2689): No description found for parameter 'uset'
  Warning(kernel/signal.c:2689): Excess function parameter 'set' description in 'sys_rt_sigpending'

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-12 20:42:10 -07:00
Steven Rostedt (Red Hat)
2721e72dd1 tracing: Fix race in snapshot swapping
Although the swap is wrapped with a spin_lock, the assignment
of the temp buffer used to swap is not within that lock.
It needs to be moved into that lock, otherwise two swaps
happening on two different CPUs, can end up using the wrong
temp buffer to assign in the swap.

Luckily, all current callers of the swap function appear to have
their own locks. But in case something is added that allows two
different callers to call the swap, then there's a chance that
this race can trigger and corrupt the buffers.

New code is coming soon that will allow for this race to trigger.

I've Cc'd stable, so this bug will not show up if someone backports
one of the changes that can trigger this bug.

Cc: stable@vger.kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-12 11:56:33 -04:00
Linus Torvalds
7c6baa304b Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Misc minor fixes mostly related to tracing"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  s390: Fix a header dependencies related build error
  tracing: update documentation of snapshot utility
  tracing: Do not return EINVAL in snapshot when not allocated
  tracing: Add help of snapshot feature when snapshot is empty
  ftrace: Update the kconfig for DYNAMIC_FTRACE
2013-03-11 07:54:29 -07:00
Arnd Bergmann
dc893e19b5 Revert parts of "hlist: drop the node parameter from iterators"
Commit b67bfe0d42 ("hlist: drop the node parameter from iterators")
did a lot of nice changes but also contains two small hunks that seem to
have slipped in accidentally and have no apparent connection to the
intent of the patch.

This reverts the two extraneous changes.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Peter Senna Tschudin <peter.senna@gmail.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-03-08 15:05:34 -08:00
Steven Rostedt (Red Hat)
c9960e4854 tracing: Do not return EINVAL in snapshot when not allocated
To use the tracing snapshot feature, writing a '1' into the snapshot
file causes the snapshot buffer to be allocated if it has not already
been allocated and dose a 'swap' with the main buffer, so that the
snapshot now contains what was in the main buffer, and the main buffer
now writes to what was the snapshot buffer.

To free the snapshot buffer, a '0' is written into the snapshot file.

To clear the snapshot buffer, any number but a '0' or '1' is written
into the snapshot file. But if the file is not allocated it returns
-EINVAL error code. This is rather pointless. It is better just to
do nothing and return success.

Acked-by: Hiraku Toyooka <hiraku.toyooka.gu@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-07 10:31:38 -05:00
Steven Rostedt (Red Hat)
d8741e2e88 tracing: Add help of snapshot feature when snapshot is empty
When cat'ing the snapshot file, instead of showing an empty trace
header like the trace file does, show how to use the snapshot
feature.

Also, this is a good place to show if the snapshot has been allocated
or not. Users may want to "pre allocate" the snapshot to have a fast
"swap" of the current buffer. Otherwise, a swap would be slow and might
fail as it would need to allocate the snapshot buffer, and that might
fail under tight memory constraints.

Here's what it looked like before:

 # tracer: nop
 #
 # entries-in-buffer/entries-written: 0/0   #P:4
 #
 #                              _-----=> irqs-off
 #                             / _----=> need-resched
 #                            | / _---=> hardirq/softirq
 #                            || / _--=> preempt-depth
 #                            ||| /     delay
 #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
 #              | |       |   ||||       |         |

Here's what it looks like now:

 # tracer: nop
 #
 #
 # * Snapshot is freed *
 #
 # Snapshot commands:
 # echo 0 > snapshot : Clears and frees snapshot buffer
 # echo 1 > snapshot : Allocates snapshot buffer, if not already allocated.
 #                      Takes a snapshot of the main buffer.
 # echo 2 > snapshot : Clears snapshot buffer (but does not allocate)
 #                      (Doesn't have to be '2' works with any number that
 #                       is not a '0' or '1')

Acked-by: Hiraku Toyooka <hiraku.toyooka.gu@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-07 10:31:22 -05:00
Li Zefan
877c685607 perf: Remove include of cgroup.h from perf_event.h
Move struct perf_cgroup_info and perf_cgroup to
kernel/perf/core.c, and then we can remove include of cgroup.h.

Signed-off-by: Li Zefan <lizefan@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tejun Heo <tj@kernel.org>
Link: http://lkml.kernel.org/r/513568A0.6020804@huawei.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-03-06 11:32:56 +01:00
Linus Torvalds
e3b59518c1 Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes and cleanups from Thomas Gleixner:
 "Commit e5ab012c32 ("nohz: Make tick_nohz_irq_exit() irq safe") is
  the first commit in the series and the minimal necessary bugfix, which
  needs to go back into stable.

  The remanining commits enforce irq disabling in irq_exit(), sanitize
  the hardirq/softirq preempt count transition and remove a bunch of no
  longer necessary conditionals."

I personally love getting rid of the very subtle and confusing
IRQ_EXIT_OFFSET thing.  Even apart from the whole "more lines removed
than added" thing.

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irq: Don't re-enable interrupts at the end of irq_exit
  irq: Remove IRQ_EXIT_OFFSET workaround
  Revert "nohz: Make tick_nohz_irq_exit() irq safe"
  irq: Sanitize invoke_softirq
  irq: Ensure irq_exit() code runs with interrupts disabled
  nohz: Make tick_nohz_irq_exit() irq safe
2013-03-05 18:10:04 -08:00