Pull tracing updates from Steven Rostedt:
"This release cycle is rather small. Just a few fixes to tracing.
The big change is the addition of the hwlat tracer. It not only
detects SMIs, but also other latency that's caused by the hardware. I
have detected some latency from large boxes having bus contention"
* tag 'trace-v4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Call traceoff trigger after event is recorded
ftrace/scripts: Add helper script to bisect function tracing problem functions
tracing: Have max_latency be defined for HWLAT_TRACER as well
tracing: Add NMI tracing in hwlat detector
tracing: Have hwlat trace migrate across tracing_cpumask CPUs
tracing: Add documentation for hwlat_detector tracer
tracing: Added hardware latency tracer
ftrace: Access ret_stack->subtime only in the function profiler
function_graph: Handle TRACE_BPUTS in print_graph_comment
tracing/uprobe: Drop isdigit() check in create_trace_uprobe
Currently ftrace_graph_ent{,_entry} and ftrace_graph_ret{,_entry} struct
can have padding bytes at the end due to alignment in 64-bit data type.
As these data are recorded so frequently, those paddings waste
non-negligible space. As the ring buffer maintains alignment properly
for each architecture, just to remove the extra padding using 'packed'
attribute.
ftrace_graph_ent_entry: 24 -> 20
ftrace_graph_ret_entry: 48 -> 44
Also I moved the 'overrun' field in struct ftrace_graph_ret to minimize
the padding in the middle.
Tested on x86_64 only.
Link: http://lkml.kernel.org/r/1467197808-13578-1-git-send-email-namhyung@kernel.org
Cc: Ingo Molnar <mingo@kernel.org>
Cc: linux-arch@vger.kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Pull livepatching updates from Jiri Kosina:
- remove of our own implementation of architecture-specific relocation
code and leveraging existing code in the module loader to perform
arch-dependent work, from Jessica Yu.
The relevant patches have been acked by Rusty (for module.c) and
Heiko (for s390).
- live patching support for ppc64le, which is a joint work of Michael
Ellerman and Torsten Duwe. This is coming from topic branch that is
share between livepatching.git and ppc tree.
- addition of livepatching documentation from Petr Mladek
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching:
livepatch: make object/func-walking helpers more robust
livepatch: Add some basic livepatch documentation
powerpc/livepatch: Add live patching support on ppc64le
powerpc/livepatch: Add livepatch stack to struct thread_info
powerpc/livepatch: Add livepatch header
livepatch: Allow architectures to specify an alternate ftrace location
ftrace: Make ftrace_location_range() global
livepatch: robustify klp_register_patch() API error checking
Documentation: livepatch: outline Elf format and requirements for patch modules
livepatch: reuse module loader code to write relocations
module: s390: keep mod_arch_specific for livepatch modules
module: preserve Elf information for livepatch modules
Elf: add livepatch-specific Elf constants
In order to support live patching on powerpc we would like to call
ftrace_location_range(), so make it global.
Signed-off-by: Torsten Duwe <duwe@suse.de>
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
KASAN needs to know whether the allocation happens in an IRQ handler.
This lets us strip everything below the IRQ entry point to reduce the
number of unique stack traces needed to be stored.
Move the definition of __irq_entry to <linux/interrupt.h> so that the
users don't need to pull in <linux/ftrace.h>. Also introduce the
__softirq_entry macro which is similar to __irq_entry, but puts the
corresponding functions to the .softirqentry.text section.
Signed-off-by: Alexander Potapenko <glider@google.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrey Konovalov <adech.fo@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The preempt_disable() invokes preempt_count_add() which saves the caller
in ->preempt_disable_ip. It uses CALLER_ADDR1 which does not look for
its caller but for the parent of the caller. Which means we get the correct
caller for something like spin_lock() unless the architectures inlines
those invocations. It is always wrong for preempt_disable() or
local_bh_disable().
This patch makes the function get_lock_parent_ip() which tries
CALLER_ADDR0,1,2 if the former is a locking function.
This seems to record the preempt_disable() caller properly for
preempt_disable() itself as well as for get_cpu_var() or
local_bh_disable().
Steven asked for the get_parent_ip() -> get_lock_parent_ip() rename.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160226135456.GB18244@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull livepatching fixes from Jiri Kosina:
- regression (from 4.4) fix for ordering issue, introduced by an
earlier ftrace change, that broke live patching of modules.
The fix replaces the ftrace module notifier by direct call in order
to make the ordering guaranteed and well-defined. The patch, from
Jessica Yu, has been acked both by Steven and Rusty
- error message fix from Miroslav Benes
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching:
ftrace/module: remove ftrace module notifier
livepatch: change the error message in asm/livepatch.h header files
Remove the ftrace module notifier in favor of directly calling
ftrace_module_enable() and ftrace_release_mod() in the module loader.
Hard-coding the function calls directly in the module loader removes
dependence on the module notifier call chain and provides better
visibility and control over what gets called when, which is important
to kernel utilities such as livepatch.
This fixes a notifier ordering issue in which the ftrace module notifier
(and hence ftrace_module_enable()) for coming modules was being called
after klp_module_notify(), which caused livepatch modules to initialize
incorrectly. This patch removes dependence on the module notifier call
chain in favor of hard coding the corresponding function calls in the
module loader. This ensures that ftrace and livepatch code get called in
the correct order on patch module load and unload.
Fixes: 5156dca34a ("ftrace: Fix the race between ftrace and insmod")
Signed-off-by: Jessica Yu <jeyu@redhat.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Reviewed-by: Petr Mladek <pmladek@suse.cz>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Pull minor tracing fixes from Steven Rostedt:
"This includes three minor fixes, mostly due to cut-and-paste issues.
The first is a cut and paste issue that changed the amount of stack to
skip when tracing a stack dump from 0 to 6, which basically made the
stack disappear for small stack traces.
The second fix is just removing an unused field in a struct that is no
longer used, and currently just wastes space.
The third is another cut-and-paste fix that had a tracepoint recording
the wrong field (it was recording the previous field a second time)"
* tag 'trace-v4.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing/dma-buf/fence: Fix timeline str value on fence_annotate_wait_on
ftrace: Remove unused nr_trampolines var
tracing: Fix stacktrace skip depth in trace_buffer_unlock_commit_regs()
Pull tracing updates from Steven Rostedt:
"Not much new with tracing for this release. Mostly just clean ups and
minor fixes.
Here's what else is new:
- A new TRACE_EVENT_FN_COND macro, combining both _FN and _COND for
those that want both.
- New selftest to test the instance create and delete
- Better debug output when ftrace fails"
* tag 'trace-v4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (24 commits)
ftrace: Fix the race between ftrace and insmod
ftrace: Add infrastructure for delayed enabling of module functions
x86: ftrace: Fix the comments for ftrace_modify_code_direct()
tracing: Fix comment to use tracing_on over tracing_enable
metag: ftrace: Fix the comments for ftrace_modify_code
sh: ftrace: Fix the comments for ftrace_modify_code()
ia64: ftrace: Fix the comments for ftrace_modify_code()
ftrace: Clean up ftrace_module_init() code
ftrace: Join functions ftrace_module_init() and ftrace_init_module()
tracing: Introduce TRACE_EVENT_FN_COND macro
tracing: Use seq_buf_used() in seq_buf_to_user() instead of len
bpf: Constify bpf_verifier_ops structure
ftrace: Have ftrace_ops_get_func() handle RCU and PER_CPU flags too
ftrace: Remove use of control list and ops
ftrace: Fix output of enabled_functions for showing tramp
ftrace: Fix a typo in comment
ftrace: Show all tramps registered to a record on ftrace_bug()
ftrace: Add variable ftrace_expected for archs to show expected code
ftrace: Add new type to distinguish what kind of ftrace_bug()
tracing: Update cond flag when enabling or disabling a trigger
...
Qiu Peiyang pointed out that there's a race when enabling function tracing
and loading a module. In order to make the modifications of converting nops
in the prologue of functions into callbacks, the text needs to be converted
from read-only to read-write. When enabling function tracing, the text
permission is updated, the functions are modified, and then they are put
back.
When loading a module, the updates to convert function calls to mcount is
done before the module text is set to read-only. But after it is done, the
module text is visible by the function tracer. Thus we have the following
race:
CPU 0 CPU 1
----- -----
start function tracing
set text to read-write
load_module
add functions to ftrace
set module text read-only
update all functions to callbacks
modify module functions too
< Can't it's read-only >
When this happens, ftrace detects the issue and disables itself till the
next reboot.
To fix this, a new DISABLED flag is added for ftrace records, which all
module functions get when they are added. Then later, after the module code
is all set, the records will have the DISABLED flag cleared, and they will
be enabled if any callback wants all functions to be traced.
Note, this doesn't add the delay to later. It simply changes the
ftrace_module_init() to do both the setting of DISABLED records, and then
immediately calls the enable code. This helps with testing this new code as
it has the same behavior as previously. Another change will come after this
to have the ftrace_module_enable() called after the text is set to
read-only.
Cc: Qiu Peiyang <peiyangx.qiu@intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
If the module init code fails after calling ftrace_module_init() and before
calling do_init_module(), we can suffer from a memory leak. This is because
ftrace_module_init() allocates pages to store the locations that ftrace
hooks are placed in the module text. If do_init_module() fails, it still
calls the MODULE_GOING notifiers which will tell ftrace to do a clean up of
the pages it allocated for the module. But if load_module() fails before
then, the pages allocated by ftrace_module_init() will never be freed.
Call ftrace_release_mod() on the module if load_module() fails before
getting to do_init_module().
Link: http://lkml.kernel.org/r/567CEA31.1070507@intel.com
Reported-by: "Qiu, PeiyangX" <peiyangx.qiu@intel.com>
Fixes: a949ae560a "ftrace/module: Hardcode ftrace_module_init() call into load_module()"
Cc: stable@vger.kernel.org # v2.6.38+
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Currently perf has its own list function within the ftrace infrastructure
that seems to be used only to allow for it to have per-cpu disabling as well
as a check to make sure that it's not called while RCU is not watching. It
uses something called the "control_ops" which is used to iterate over ops
under it with the control_list_func().
The problem is that this control_ops and control_list_func unnecessarily
complicates the code. By replacing FTRACE_OPS_FL_CONTROL with two new flags
(FTRACE_OPS_FL_RCU and FTRACE_OPS_FL_PER_CPU) we can remove all the code
that is special with the control ops and add the needed checks within the
generic ftrace_list_func().
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
When an anomaly is found while modifying function code, ftrace_bug() is
called which disables the function tracing infrastructure and reports
information about what failed. If the code that is to be replaced does not
match what is expected, then actual code is shown. Currently there is no
arch generic way to show what was expected.
Add a new variable pointer calld ftrace_expected that the arch code can set
to point to what it expected so that ftrace_bug() can report the actual text
as well as the text that was expected to be there.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
The ftrace function hook utility has several internal checks to make sure
that whatever it modifies is exactly what it expects to be modifying. This
is essential as modifying running code can be extremely dangerous to the
system.
When an anomaly is detected, ftrace_bug() is called which sends a splat to
the console and disables function tracing. There's some extra information
that is printed to help diagnose the issue.
One thing that is missing though is output of what ftrace was doing at the
time of the crash. Was it updating a call site or perhaps converting a call
site to a nop? A new global enum variable is created to state what ftrace
was doing at the time of the anomaly, and this is reported in ftrace_bug().
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Now that max_stack_lock is a global variable, it requires a naming
convention that is unlikely to collide. Rename it to the same naming
convention that the other stack_trace variables have.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
A stack frame may be used in a different way depending on cpu architecture.
Thus it is not always appropriate to slurp the stack contents, as current
check_stack() does, in order to calcurate a stack index (height) at a given
function call. At least not on arm64.
In addition, there is a possibility that we will mistakenly detect a stale
stack frame which has not been overwritten.
This patch makes check_stack() a weak function so as to later implement
arch-specific version.
Link: http://lkml.kernel.org/r/1446182741-31019-5-git-send-email-takahiro.akashi@linaro.org
Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Commit 4104d326b6 ("ftrace: Remove global function list and call function
directly") simplified the ftrace code by removing the global_ops list with a
new design. But this cleanup also broke the filtering of PIDs that are added
to the set_ftrace_pid file.
Add back the proper hooks to have pid filtering working once again.
Cc: stable@vger.kernel.org # 3.16+
Reported-by: Matt Fleming <matt@console-pimps.org>
Reported-by: Richard Weinberger <richard.weinberger@gmail.com>
Tested-by: Matt Fleming <matt@console-pimps.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Add the kernel command line tp_printk option that will have tracepoints
that are active sent to printk() as well as to the trace buffer.
Passing "tp_printk" will activate this. To turn it off, the sysctl
/proc/sys/kernel/tracepoint_printk can have '0' echoed into it. Note,
this only works if the cmdline option is used. Echoing 1 into the sysctl
file without the cmdline option will have no affect.
Note, this is a dangerous option. Having high frequency tracepoints send
their data to printk() can possibly cause a live lock. This is another
reason why this is only active if the command line option is used.
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1412121539300.16494@nanos
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>