Impact: add way to quickly start stop tracing from the kernel
This patch adds a soft stop and start to the trace. This simply
disables function tracing via the ftrace_disabled flag, and
disables the trace buffers to prevent recording. The tracing
code may still be executed, but the trace will not be recorded.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: quick start and stop of function tracer
This patch adds a way to disable the function tracer quickly without
the need to run kstop_machine. It adds a new variable called
function_trace_stop which will stop the calls to functions from mcount
when set. This is just an on/off switch and does not handle recursion
like preempt_disable().
It's main purpose is to help other tracers/debuggers start and stop tracing
fuctions without the need to call kstop_machine.
The config option HAVE_FUNCTION_TRACE_MCOUNT_TEST is added for archs
that implement the testing of the function_trace_stop in the mcount
arch dependent code. Otherwise, the test is done in the C code.
x86 is the only arch at the moment that supports this.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix boot tracer + sched tracer coupling bug
Fix a bug that made the sched_switch tracer unable to run
if set as the current_tracer after the boot tracer.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
This patch applies some corrections suggested by Steven Rostedt.
Change the type of shed_ref into int since it is used
into a Mutex, we don't need it anymore as an atomic
variable in the sched_switch tracer.
Also change the name of the register mutex.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: enhance boot trace output with scheduling events
Use the sched_switch tracer from the boot tracer.
We also can trace schedule events inside the initcalls.
Sched tracing is disabled after the initcall has finished and
then reenabled before the next one is started.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: cleanup
When init_sched_switch_trace() is called, it has no reason to start
the sched tracer if the sched_ref is not zero.
_ If this is non-zero, the tracer is already used, but we can register it
to the tracing engine. There is already a security which avoid the tracer
probes not to be resgistered twice.
_ If this is zero, this block will not be used.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix race condition in sched_switch tracer
This patch fixes a race condition in the sched_switch tracer. If
several tasks (IE: concurrent initcalls) are playing with
tracing_start_cmdline_record() and tracing_stop_cmdline_record(), the
following situation could happen:
_ Task A and B are using the same tracepoint probe. Task A holds it.
Task B is sleeping and doesn't hold it.
_ Task A frees the sched tracer, then sched_ref is decremented to 0.
_ Task A is preempted and hadn't yet unregistered its tracepoint
probe, then B runs.
_ B increments sched_ref, sees it's 1 and then guess it has to
register its probe. But it has not been freed by task A.
_ A lot of bad things can happen after that...
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: modify boot tracer
We used to disable the initcall tracing at a specified time (IE: end
of builtin initcalls). But we don't need it anymore. It will be
stopped when initcalls are finished.
However we want two things:
_Start this tracing only after pre-smp initcalls are finished.
_Since we are planning to trace sched_switches at the same time, we
want to enable them only during the initcall execution.
For this purpose, this patch introduce two functions to enable/disable
the sched_switch tracing during boot.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: fix sysctl name typo
Steve must have needed more coffee ;-)
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: disable interrupts during trace entry creation (as opposed to preempt)
To help with performance, I set the ftracer to not disable interrupts,
and only to disable preemption. If an interrupt occurred, it would not
be traced, because the function tracer protects itself from recursion.
This may be faster, but the trace output might miss some traces.
This patch makes the fuction trace disable interrupts, but it also
adds a runtime feature to disable preemption instead. It does this by
having two different tracer functions. When the function tracer is
enabled, it will check to see which version is requested (irqs disabled
or preemption disabled). Then it will use the corresponding function
as the tracer.
Irq disabling is the default behaviour, but if the user wants better
performance, with the chance of missing traces, then they can choose
the preempt disabled version.
Running hackbench 3 times with the irqs disabled and 3 times with
the preempt disabled function tracer yielded:
tracing type times entries recorded
------------ -------- ----------------
irq disabled 43.393 166433066
43.282 166172618
43.298 166256704
preempt disabled 38.969 159871710
38.943 159972935
39.325 161056510
Average:
irqs disabled: 43.324 166287462
preempt disabled: 39.079 160300385
preempt is 10.8 percent faster than irqs disabled.
I wrote a patch to count function trace recursion and reran hackbench.
With irq disabled: 1,150 times the function tracer did not trace due to
recursion.
with preempt disabled: 5,117,718 times.
The thousand times with irq disabled could be due to NMIs, or simply a case
where it called a function that was not protected by notrace.
But we also see that a large amount of the trace is lost with the
preempt version.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: use new, consolidated APIs in ftrace plugins
This patch replaces the schedule safe preempt disable code with the
ftrace_preempt_disable() and ftrace_preempt_enable() safe functions.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: add new ftrace-plugin internal APIs
Parts of the tracer needs to be careful about schedule recursion.
If the NEED_RESCHED flag is set, a preempt_enable will call schedule.
Inside the schedule function, the NEED_RESCHED flag is cleared.
The problem arises when a trace happens in the schedule function but before
NEED_RESCHED is cleared. The race is as follows:
schedule()
>> tracer called
trace_function()
preempt_disable()
[ record trace ]
preempt_enable() <<- here's the issue.
[check NEED_RESCHED]
schedule()
[ Repeat the above, over and over again ]
The naive approach is simply to use preempt_enable_no_schedule instead.
The problem with that approach is that, although we solve the schedule
recursion issue, we now might lose a preemption check when not in the
schedule function.
trace_function()
preempt_disable()
[ record trace ]
[Interrupt comes in and sets NEED_RESCHED]
preempt_enable_no_resched()
[continue without scheduling]
The way ftrace handles this problem is with the following approach:
int resched;
resched = need_resched();
preempt_disable_notrace();
[record trace]
if (resched)
preempt_enable_no_sched_notrace();
else
preempt_enable_notrace();
This may seem like the opposite of what we want. If resched is set
then we call the "no_sched" version?? The reason we do this is because
if NEED_RESCHED is set before we disable preemption, there's two reasons
for that:
1) we are in an atomic code path
2) we are already on our way to the schedule function, and maybe even
in the schedule function, but have yet to clear the flag.
Both the above cases we do not want to schedule.
This solution has already been implemented within the ftrace infrastructure.
But the problem is that it has been implemented several times. This patch
encapsulates this code to two nice functions.
resched = ftrace_preempt_disable();
[ record trace]
ftrace_preempt_enable(resched);
This way the tracers do not need to worry about getting it right.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: add new tracepoint APIs to allow the batched registration of probes
new APIs separate tracepoint_probe_register(),
tracepoint_probe_unregister() into 2 steps. The first step of them
is just update tracepoint_entry, not connect or disconnect.
this patch introduces tracepoint_probe_update_all() for update all.
these APIs are very useful for registering lots of probes
but just updating once. Another very important thing is that
*_noupdate APIs do not require module_mutex.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: simplify implementation
Now, unused memory is handled by struct tp_probes.
old code use these three field to handle unused memory.
struct tracepoint_entry {
...
struct rcu_head rcu;
void *oldptr;
unsigned char rcu_pending:1;
...
};
in this way, unused memory is handled by struct tracepoint_entry.
it bring reenter bug(it was fixed) and tracepoint.c is filled
full of ".*rcu.*" code statements. this patch removes all these.
and:
rcu_barrier_sched() is removed.
Do not need regain tracepoints_mutex after tracepoint_update_probes()
several little cleanup.
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: build fix on !stacktrace architectures
only select STACKTRACE on architectures that have STACKTRACE_SUPPORT
... since we also need to ifdef out the guts of ftrace_trace_stack().
We also want to disallow setting TRACE_ITER_STACKTRACE in trace_flags
on such configs, but that can wait.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: add new (optional) debug boot option
In order to facilitate early boot trouble, allow one to specify a tracer
on the kernel boot line.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Insufficient dependency - we really want CONFIG_RTC_CLASS=y there.
That will give us CONFIG_RTC_LIB=y, so the old dependency can be
simply replaced.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This one apparently doesn't generate any warnings, because the function
is only used during system bootup, when the warnings are disabled. But
it's still very wrong.
The __reserve_region_with_split() function is called with the
resource_lock held for writing, so it must only ever do GFP_ATOMIC
allocations.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Impact: cleanup
This patch cleans up the NMI safe code for dynamic ftrace as suggested
by Andrew Morton.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched: remove sched-design.txt from 00-INDEX
sched: change sched_debug's mode to 0444
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
ftrace: handle archs that do not support irqs_disabled_flags