Merge tag 'trace-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:
 "The majority of the changes are for fixes and clean ups.

  Notable changes:

   - Rework trace event triggers code to be easier to interact with.

   - Support for embedding bootconfig with the kernel (as opposed to
     having it attached to the initrd image). This is useful for
     embedded boards without initrd support.

   - Speed up boot by parallelizing the creation of tracefs files.

   - Allow absolute ring buffer timestamps to handle timestamps that
     use more than 59 bits.

   - Add a new tracing clock, "TAI" (International Atomic Time).

   - Have weak functions show up in the available_filter_functions list
     as: __ftrace_invalid_address___<invalid-offset> instead of using
     the name of the function before it"

* tag 'trace-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (52 commits)
  ftrace: Add FTRACE_MCOUNT_MAX_OFFSET to avoid adding weak function
  tracing: Fix comments for event_trigger_separate_filter()
  x86/tracepoint: Fix comment about irq vector tracepoints
  x86,tracing: Remove unused headers
  ftrace: Clean up hash direct_functions on register failures
  tracing: Fix comments of create_filter()
  tracing: Disable kcov on trace_preemptirq.c
  tracing: Initialize integer variable to prevent garbage return value
  ftrace: Fix typo in comment
  ftrace: Remove return value of ftrace_arch_modify_*()
  tracing: Cleanup code by removing init "char *name"
  tracing: Change "char *" string form to "char []"
  tracing/timerlat: Do not wakeup the thread if the trace stops at the IRQ
  tracing/timerlat: Print stacktrace in the IRQ handler if needed
  tracing/timerlat: Notify IRQ new max latency only if stop tracing is set
  kprobes: Fix build errors with CONFIG_KRETPROBES=n
  tracing: Fix return value of trace_pid_write()
  tracing: Fix potential double free in create_var_ref()
  tracing: Use strim() to remove whitespace instead of doing it manually
  ftrace: Deal with error return code of the ftrace_process_locs() function
  ...
Linus Torvalds, 2022-05-29 10:31:36 -07:00
40 changed files with 827 additions and 651 deletions


@@ -158,9 +158,15 @@ Each key-value pair is shown in each line with following style::
Boot Kernel With a Boot Config
==============================
Since the boot configuration file is loaded with initrd, it will be added
to the end of the initrd (initramfs) image file with padding, size,
checksum and 12-byte magic word as below.
There are two options to boot the kernel with bootconfig: attaching the
bootconfig to the initrd image or embedding it in the kernel itself.
Attaching a Boot Config to Initrd
---------------------------------
Since the boot configuration file is loaded with initrd by default,
it will be added to the end of the initrd (initramfs) image file with
padding, size, checksum and 12-byte magic word as below.
[initrd][bootconfig][padding][size(le32)][checksum(le32)][#BOOTCONFIG\n]
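In practice this layout is managed by the bootconfig tool (tools/bootconfig);
a sketch, with illustrative file names::

    # Attach (apply) a bootconfig to the end of an initrd image
    bootconfig -a mybootconfig.bconf /boot/initrd.img

    # List the bootconfig currently attached to the image
    bootconfig -l /boot/initrd.img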
@@ -196,6 +202,25 @@ To remove the config from the image, you can use -d option as below::
Then add "bootconfig" on the normal kernel command line to tell the
kernel to look for the bootconfig at the end of the initrd file.
Embedding a Boot Config into Kernel
-----------------------------------
If you can not use initrd, you can also embed the bootconfig file in the
kernel by Kconfig options. In this case, you need to recompile the kernel
with the following configs::
CONFIG_BOOT_CONFIG_EMBED=y
CONFIG_BOOT_CONFIG_EMBED_FILE="/PATH/TO/BOOTCONFIG/FILE"
``CONFIG_BOOT_CONFIG_EMBED_FILE`` requires an absolute path or a relative
path to the bootconfig file from source tree or object tree.
The kernel will embed it as the default bootconfig.
Just as when attaching the bootconfig to the initrd, you need ``bootconfig``
option on the kernel command line to enable the embedded bootconfig.
Note that even if you set this option, you can override the embedded
bootconfig by another bootconfig attached to the initrd.
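For a board that cannot use an initrd, a minimal setup could look like the
following sketch (the file name is illustrative and the keys are borrowed
from the boot-time tracing documentation)::

    # .config fragment
    CONFIG_BOOT_CONFIG_EMBED=y
    CONFIG_BOOT_CONFIG_EMBED_FILE="board.bconf"

    # board.bconf
    ftrace.event.task.task_newtask {
        filter = "pid < 128"
        enable
    }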
Kernel parameters via Boot Config
=================================


@@ -517,6 +517,18 @@ of ftrace. Here is a list of some of the key files:
processing should be able to handle them. See comments in the
ktime_get_boot_fast_ns() function for more information.
tai:
This is the tai clock (CLOCK_TAI) and is derived from the wall-
clock time. However, this clock does not experience
discontinuities and backwards jumps caused by NTP inserting leap
seconds. Since the clock access is designed for use in tracing,
side effects are possible. The clock access may yield wrong
readouts in case the internal TAI offset is updated e.g., caused
by setting the system time or using adjtimex() with an offset.
These effects are rare and post processing should be able to
handle them. See comments in the ktime_get_tai_fast_ns()
function for more information.
To set a clock, simply echo the clock name into this file::
# echo global > trace_clock
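For example, to switch to the new TAI clock::

    # echo tai > trace_clock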


@@ -74,8 +74,9 @@ directory. The timerlat configs are:
- stop_tracing_total_us: stop the system tracing if a
timer latency at the *thread* context is higher than the configured
value happens. Writing 0 disables this option.
- print_stack: save the stack of the IRQ occurrence, and print
it after the *thread context* event".
- print_stack: save the stack of the IRQ occurrence. The stack is printed
after the *thread context* event, or at the IRQ handler if *stop_tracing_us*
is hit.
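A typical sequence exercising these options might be (a sketch: the values
are arbitrary, and writing a non-zero value to print_stack is assumed to
enable it)::

    cd /sys/kernel/tracing
    echo timerlat > current_tracer
    # stop the whole trace if a thread latency exceeds 100 us
    echo 100 > osnoise/stop_tracing_total_us
    # save the stack of the IRQ occurrence
    echo 1 > osnoise/print_stack
    echo 1 > tracing_on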
timerlat and osnoise
----------------------------


@@ -7517,6 +7517,7 @@ S: Maintained
F: Documentation/admin-guide/bootconfig.rst
F: fs/proc/bootconfig.c
F: include/linux/bootconfig.h
F: lib/bootconfig-data.S
F: lib/bootconfig.c
F: tools/bootconfig/*
F: tools/bootconfig/scripts/*
@@ -20119,8 +20120,8 @@ M: Ingo Molnar <mingo@redhat.com>
S: Maintained
T: git git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
F: Documentation/trace/ftrace.rst
F: arch/*/*/*/ftrace.h
F: arch/*/kernel/ftrace.c
F: arch/*/*/*/*ftrace*
F: arch/*/*/*ftrace*
F: fs/tracefs/
F: include/*/ftrace.h
F: include/linux/trace*.h


@@ -79,16 +79,14 @@ static unsigned long __ref adjust_address(struct dyn_ftrace *rec,
return (unsigned long)&ftrace_regs_caller_from_init;
}
int ftrace_arch_code_modify_prepare(void)
void ftrace_arch_code_modify_prepare(void)
{
return 0;
}
int ftrace_arch_code_modify_post_process(void)
void ftrace_arch_code_modify_post_process(void)
{
/* Make sure any TLB misses during machine stop are cleared. */
flush_tlb_all();
return 0;
}
static unsigned long ftrace_call_replace(unsigned long pc, unsigned long addr,


@@ -12,16 +12,14 @@
#include <asm/patch.h>
#ifdef CONFIG_DYNAMIC_FTRACE
int ftrace_arch_code_modify_prepare(void) __acquires(&text_mutex)
void ftrace_arch_code_modify_prepare(void) __acquires(&text_mutex)
{
mutex_lock(&text_mutex);
return 0;
}
int ftrace_arch_code_modify_post_process(void) __releases(&text_mutex)
void ftrace_arch_code_modify_post_process(void) __releases(&text_mutex)
{
mutex_unlock(&text_mutex);
return 0;
}
static int ftrace_check_current_call(unsigned long hook_pos,


@@ -225,14 +225,13 @@ void arch_ftrace_update_code(int command)
ftrace_modify_all_code(command);
}
int ftrace_arch_code_modify_post_process(void)
void ftrace_arch_code_modify_post_process(void)
{
/*
* Flush any pre-fetched instructions on all
* CPUs to make the new code visible.
*/
text_poke_sync_lock();
return 0;
}
#ifdef CONFIG_MODULES


@@ -9,6 +9,13 @@
# define MCOUNT_ADDR ((unsigned long)(__fentry__))
#define MCOUNT_INSN_SIZE 5 /* sizeof mcount call */
/* Ignore unused weak functions which will have non zero offsets */
#ifdef CONFIG_HAVE_FENTRY
# include <asm/ibt.h>
/* Add offset for endbr64 if IBT enabled */
# define FTRACE_MCOUNT_MAX_OFFSET ENDBR_INSN_SIZE
#endif
#ifdef CONFIG_DYNAMIC_FTRACE
#define ARCH_SUPPORTS_FTRACE_OPS 1
#endif


@@ -37,7 +37,7 @@
static int ftrace_poke_late = 0;
int ftrace_arch_code_modify_prepare(void)
void ftrace_arch_code_modify_prepare(void)
__acquires(&text_mutex)
{
/*
@@ -47,10 +47,9 @@ int ftrace_arch_code_modify_prepare(void)
*/
mutex_lock(&text_mutex);
ftrace_poke_late = 1;
return 0;
}
int ftrace_arch_code_modify_post_process(void)
void ftrace_arch_code_modify_post_process(void)
__releases(&text_mutex)
{
/*
@@ -61,7 +60,6 @@ int ftrace_arch_code_modify_post_process(void)
text_poke_finish();
ftrace_poke_late = 0;
mutex_unlock(&text_mutex);
return 0;
}
static const char *ftrace_nop_replace(void)


@@ -1,17 +1,11 @@
// SPDX-License-Identifier: GPL-2.0
/*
* Code for supporting irq vector tracepoints.
*
* Copyright (C) 2013 Seiji Aguchi <seiji.aguchi@hds.com>
*
*/
#include <linux/jump_label.h>
#include <linux/atomic.h>
#include <asm/hw_irq.h>
#include <asm/desc.h>
#include <asm/trace/exceptions.h>
#include <asm/trace/irq_vectors.h>
DEFINE_STATIC_KEY_FALSE(trace_pagefault_key);


@@ -289,4 +289,14 @@ int __init xbc_get_info(int *node_size, size_t *data_size);
/* XBC cleanup data structures */
void __init xbc_exit(void);
/* XBC embedded bootconfig data in kernel */
#ifdef CONFIG_BOOT_CONFIG_EMBED
const char * __init xbc_get_embedded_bootconfig(size_t *size);
#else
static inline const char *xbc_get_embedded_bootconfig(size_t *size)
{
return NULL;
}
#endif
#endif
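A minimal caller-side sketch (mirroring the init/main.c usage later in this
series; the pr_info() is only for illustration):

	size_t size;
	const char *data;

	/* Fall back to the bootconfig embedded at build time, if any */
	data = xbc_get_embedded_bootconfig(&size);
	if (data)
		pr_info("embedded bootconfig: %zu bytes\n", size);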


@@ -452,8 +452,8 @@ static inline void stack_tracer_enable(void) { }
#ifdef CONFIG_DYNAMIC_FTRACE
int ftrace_arch_code_modify_prepare(void);
int ftrace_arch_code_modify_post_process(void);
void ftrace_arch_code_modify_prepare(void);
void ftrace_arch_code_modify_post_process(void);
enum ftrace_bug_type {
FTRACE_BUG_UNKNOWN,


@@ -424,7 +424,7 @@ void unregister_kretprobe(struct kretprobe *rp);
int register_kretprobes(struct kretprobe **rps, int num);
void unregister_kretprobes(struct kretprobe **rps, int num);
#ifdef CONFIG_KRETPROBE_ON_RETHOOK
#if defined(CONFIG_KRETPROBE_ON_RETHOOK) || !defined(CONFIG_KRETPROBES)
#define kprobe_flush_task(tk) do {} while (0)
#else
void kprobe_flush_task(struct task_struct *tk);


@@ -1338,7 +1338,7 @@ endif
config BOOT_CONFIG
bool "Boot config support"
select BLK_DEV_INITRD
select BLK_DEV_INITRD if !BOOT_CONFIG_EMBED
help
Extra boot config allows system admin to pass a config file as
complemental extension of kernel cmdline when booting.
@@ -1348,6 +1348,25 @@ config BOOT_CONFIG
If unsure, say Y.
config BOOT_CONFIG_EMBED
bool "Embed bootconfig file in the kernel"
depends on BOOT_CONFIG
help
Embed a bootconfig file given by BOOT_CONFIG_EMBED_FILE in the
kernel. Usually, the bootconfig file is loaded with the initrd
image. But if the system doesn't support initrd, this option will
help you by embedding a bootconfig file while building the kernel.
If unsure, say N.
config BOOT_CONFIG_EMBED_FILE
string "Embedded bootconfig file path"
depends on BOOT_CONFIG_EMBED
help
Specify a bootconfig file which will be embedded to the kernel.
This bootconfig will be used if there is no initrd or no other
bootconfig in the initrd.
config INITRAMFS_PRESERVE_MTIME
bool "Preserve cpio archive mtimes in initramfs"
default y


@@ -266,7 +266,7 @@ static int __init loglevel(char *str)
early_param("loglevel", loglevel);
#ifdef CONFIG_BLK_DEV_INITRD
static void * __init get_boot_config_from_initrd(u32 *_size, u32 *_csum)
static void * __init get_boot_config_from_initrd(size_t *_size)
{
u32 size, csum;
char *data;
@@ -300,17 +300,20 @@ found:
return NULL;
}
if (xbc_calc_checksum(data, size) != csum) {
pr_err("bootconfig checksum failed\n");
return NULL;
}
/* Remove bootconfig from initramfs/initrd */
initrd_end = (unsigned long)data;
if (_size)
*_size = size;
if (_csum)
*_csum = csum;
return data;
}
#else
static void * __init get_boot_config_from_initrd(u32 *_size, u32 *_csum)
static void * __init get_boot_config_from_initrd(size_t *_size)
{
return NULL;
}
@@ -407,14 +410,16 @@ static int __init warn_bootconfig(char *str)
static void __init setup_boot_config(void)
{
static char tmp_cmdline[COMMAND_LINE_SIZE] __initdata;
const char *msg;
int pos;
u32 size, csum;
char *data, *err;
int ret;
const char *msg, *data;
int pos, ret;
size_t size;
char *err;
/* Cut out the bootconfig data even if we have no bootconfig option */
data = get_boot_config_from_initrd(&size, &csum);
data = get_boot_config_from_initrd(&size);
/* If there is no bootconfig in initrd, try embedded one. */
if (!data)
data = xbc_get_embedded_bootconfig(&size);
strlcpy(tmp_cmdline, boot_command_line, COMMAND_LINE_SIZE);
err = parse_args("bootconfig", tmp_cmdline, NULL, 0, 0, 0, NULL,
@@ -433,13 +438,8 @@ static void __init setup_boot_config(void)
}
if (size >= XBC_DATA_MAX) {
pr_err("bootconfig size %d greater than max size %d\n",
size, XBC_DATA_MAX);
return;
}
if (xbc_calc_checksum(data, size) != csum) {
pr_err("bootconfig checksum failed\n");
pr_err("bootconfig size %ld greater than max size %d\n",
(long)size, XBC_DATA_MAX);
return;
}
@@ -452,7 +452,7 @@ static void __init setup_boot_config(void)
msg, pos);
} else {
xbc_get_info(&ret, NULL);
pr_info("Load bootconfig: %d bytes %d nodes\n", size, ret);
pr_info("Load bootconfig: %ld bytes %d nodes\n", (long)size, ret);
/* keys starting with "kernel." are passed via cmdline */
extra_command_line = xbc_make_cmdline("kernel");
/* Also, "init." keys are init arguments */
@@ -471,7 +471,7 @@ static void __init exit_boot_config(void)
static void __init setup_boot_config(void)
{
/* Remove bootconfig data from initrd */
get_boot_config_from_initrd(NULL, NULL);
get_boot_config_from_initrd(NULL);
}
static int __init warn_bootconfig(char *str)


@@ -1257,79 +1257,6 @@ void kprobe_busy_end(void)
preempt_enable();
}
#if !defined(CONFIG_KRETPROBE_ON_RETHOOK)
static void free_rp_inst_rcu(struct rcu_head *head)
{
struct kretprobe_instance *ri = container_of(head, struct kretprobe_instance, rcu);
if (refcount_dec_and_test(&ri->rph->ref))
kfree(ri->rph);
kfree(ri);
}
NOKPROBE_SYMBOL(free_rp_inst_rcu);
static void recycle_rp_inst(struct kretprobe_instance *ri)
{
struct kretprobe *rp = get_kretprobe(ri);
if (likely(rp))
freelist_add(&ri->freelist, &rp->freelist);
else
call_rcu(&ri->rcu, free_rp_inst_rcu);
}
NOKPROBE_SYMBOL(recycle_rp_inst);
/*
* This function is called from delayed_put_task_struct() when a task is
* dead and cleaned up to recycle any kretprobe instances associated with
* this task. These left over instances represent probed functions that
* have been called but will never return.
*/
void kprobe_flush_task(struct task_struct *tk)
{
struct kretprobe_instance *ri;
struct llist_node *node;
/* Early boot, not yet initialized. */
if (unlikely(!kprobes_initialized))
return;
kprobe_busy_begin();
node = __llist_del_all(&tk->kretprobe_instances);
while (node) {
ri = container_of(node, struct kretprobe_instance, llist);
node = node->next;
recycle_rp_inst(ri);
}
kprobe_busy_end();
}
NOKPROBE_SYMBOL(kprobe_flush_task);
static inline void free_rp_inst(struct kretprobe *rp)
{
struct kretprobe_instance *ri;
struct freelist_node *node;
int count = 0;
node = rp->freelist.head;
while (node) {
ri = container_of(node, struct kretprobe_instance, freelist);
node = node->next;
kfree(ri);
count++;
}
if (refcount_sub_and_test(count, &rp->rph->ref)) {
kfree(rp->rph);
rp->rph = NULL;
}
}
#endif /* !CONFIG_KRETPROBE_ON_RETHOOK */
/* Add the new probe to 'ap->list'. */
static int add_new_kprobe(struct kprobe *ap, struct kprobe *p)
{
@@ -1928,6 +1855,77 @@ static struct notifier_block kprobe_exceptions_nb = {
#ifdef CONFIG_KRETPROBES
#if !defined(CONFIG_KRETPROBE_ON_RETHOOK)
static void free_rp_inst_rcu(struct rcu_head *head)
{
struct kretprobe_instance *ri = container_of(head, struct kretprobe_instance, rcu);
if (refcount_dec_and_test(&ri->rph->ref))
kfree(ri->rph);
kfree(ri);
}
NOKPROBE_SYMBOL(free_rp_inst_rcu);
static void recycle_rp_inst(struct kretprobe_instance *ri)
{
struct kretprobe *rp = get_kretprobe(ri);
if (likely(rp))
freelist_add(&ri->freelist, &rp->freelist);
else
call_rcu(&ri->rcu, free_rp_inst_rcu);
}
NOKPROBE_SYMBOL(recycle_rp_inst);
/*
* This function is called from delayed_put_task_struct() when a task is
* dead and cleaned up to recycle any kretprobe instances associated with
* this task. These left over instances represent probed functions that
* have been called but will never return.
*/
void kprobe_flush_task(struct task_struct *tk)
{
struct kretprobe_instance *ri;
struct llist_node *node;
/* Early boot, not yet initialized. */
if (unlikely(!kprobes_initialized))
return;
kprobe_busy_begin();
node = __llist_del_all(&tk->kretprobe_instances);
while (node) {
ri = container_of(node, struct kretprobe_instance, llist);
node = node->next;
recycle_rp_inst(ri);
}
kprobe_busy_end();
}
NOKPROBE_SYMBOL(kprobe_flush_task);
static inline void free_rp_inst(struct kretprobe *rp)
{
struct kretprobe_instance *ri;
struct freelist_node *node;
int count = 0;
node = rp->freelist.head;
while (node) {
ri = container_of(node, struct kretprobe_instance, freelist);
node = node->next;
kfree(ri);
count++;
}
if (refcount_sub_and_test(count, &rp->rph->ref)) {
kfree(rp->rph);
rp->rph = NULL;
}
}
/* This assumes the 'tsk' is the current task or is not running. */
static kprobe_opcode_t *__kretprobe_find_ret_addr(struct task_struct *tsk,
struct llist_node **cur)


@@ -31,6 +31,10 @@ ifdef CONFIG_GCOV_PROFILE_FTRACE
GCOV_PROFILE := y
endif
# Functions in this file could be invoked from early interrupt
# code and produce random code coverage.
KCOV_INSTRUMENT_trace_preemptirq.o := n
CFLAGS_bpf_trace.o := -I$(src)
CFLAGS_trace_benchmark.o := -I$(src)


@@ -45,6 +45,8 @@
#include "trace_output.h"
#include "trace_stat.h"
#define FTRACE_INVALID_FUNCTION "__ftrace_invalid_address__"
#define FTRACE_WARN_ON(cond) \
({ \
int ___r = cond; \
@@ -119,7 +121,7 @@ struct ftrace_ops __rcu *ftrace_ops_list __read_mostly = &ftrace_list_end;
ftrace_func_t ftrace_trace_function __read_mostly = ftrace_stub;
struct ftrace_ops global_ops;
/* Defined by vmlinux.lds.h see the commment above arch_ftrace_ops_list_func for details */
/* Defined by vmlinux.lds.h see the comment above arch_ftrace_ops_list_func for details */
void ftrace_ops_list_func(unsigned long ip, unsigned long parent_ip,
struct ftrace_ops *op, struct ftrace_regs *fregs);
@@ -952,7 +954,6 @@ static struct tracer_stat function_stats __initdata = {
static __init void ftrace_profile_tracefs(struct dentry *d_tracer)
{
struct ftrace_profile_stat *stat;
struct dentry *entry;
char *name;
int ret;
int cpu;
@@ -983,11 +984,9 @@ static __init void ftrace_profile_tracefs(struct dentry *d_tracer)
}
}
entry = tracefs_create_file("function_profile_enabled",
TRACE_MODE_WRITE, d_tracer, NULL,
&ftrace_profile_fops);
if (!entry)
pr_warn("Could not create tracefs 'function_profile_enabled' entry\n");
trace_create_file("function_profile_enabled",
TRACE_MODE_WRITE, d_tracer, NULL,
&ftrace_profile_fops);
}
#else /* CONFIG_FUNCTION_PROFILER */
@@ -2707,18 +2706,16 @@ ftrace_nop_initialize(struct module *mod, struct dyn_ftrace *rec)
* archs can override this function if they must do something
* before the modifying code is performed.
*/
int __weak ftrace_arch_code_modify_prepare(void)
void __weak ftrace_arch_code_modify_prepare(void)
{
return 0;
}
/*
* archs can override this function if they must do something
* after the modifying code is performed.
*/
int __weak ftrace_arch_code_modify_post_process(void)
void __weak ftrace_arch_code_modify_post_process(void)
{
return 0;
}
void ftrace_modify_all_code(int command)
@@ -2804,12 +2801,7 @@ void __weak arch_ftrace_update_code(int command)
static void ftrace_run_update_code(int command)
{
int ret;
ret = ftrace_arch_code_modify_prepare();
FTRACE_WARN_ON(ret);
if (ret)
return;
ftrace_arch_code_modify_prepare();
/*
* By default we use stop_machine() to modify the code.
@@ -2819,8 +2811,7 @@ static void ftrace_run_update_code(int command)
*/
arch_ftrace_update_code(command);
ret = ftrace_arch_code_modify_post_process();
FTRACE_WARN_ON(ret);
ftrace_arch_code_modify_post_process();
}
static void ftrace_run_modify_code(struct ftrace_ops *ops, int command,
@@ -3631,6 +3622,105 @@ static void add_trampoline_func(struct seq_file *m, struct ftrace_ops *ops,
seq_printf(m, " ->%pS", ptr);
}
#ifdef FTRACE_MCOUNT_MAX_OFFSET
/*
* Weak functions can still have an mcount/fentry that is saved in
* the __mcount_loc section. These can be detected by having a
* symbol offset of greater than FTRACE_MCOUNT_MAX_OFFSET, as the
* symbol found by kallsyms is not the function that the mcount/fentry
* is part of. The offset is much greater in these cases.
*
* Test the record to make sure that the ip points to a valid kallsyms
* and if not, mark it disabled.
*/
static int test_for_valid_rec(struct dyn_ftrace *rec)
{
char str[KSYM_SYMBOL_LEN];
unsigned long offset;
const char *ret;
ret = kallsyms_lookup(rec->ip, NULL, &offset, NULL, str);
/* Weak functions can cause invalid addresses */
if (!ret || offset > FTRACE_MCOUNT_MAX_OFFSET) {
rec->flags |= FTRACE_FL_DISABLED;
return 0;
}
return 1;
}
static struct workqueue_struct *ftrace_check_wq __initdata;
static struct work_struct ftrace_check_work __initdata;
/*
* Scan all the mcount/fentry entries to make sure they are valid.
*/
static __init void ftrace_check_work_func(struct work_struct *work)
{
struct ftrace_page *pg;
struct dyn_ftrace *rec;
mutex_lock(&ftrace_lock);
do_for_each_ftrace_rec(pg, rec) {
test_for_valid_rec(rec);
} while_for_each_ftrace_rec();
mutex_unlock(&ftrace_lock);
}
static int __init ftrace_check_for_weak_functions(void)
{
INIT_WORK(&ftrace_check_work, ftrace_check_work_func);
ftrace_check_wq = alloc_workqueue("ftrace_check_wq", WQ_UNBOUND, 0);
queue_work(ftrace_check_wq, &ftrace_check_work);
return 0;
}
static int __init ftrace_check_sync(void)
{
/* Make sure the ftrace_check updates are finished */
if (ftrace_check_wq)
destroy_workqueue(ftrace_check_wq);
return 0;
}
late_initcall_sync(ftrace_check_sync);
subsys_initcall(ftrace_check_for_weak_functions);
static int print_rec(struct seq_file *m, unsigned long ip)
{
unsigned long offset;
char str[KSYM_SYMBOL_LEN];
char *modname;
const char *ret;
ret = kallsyms_lookup(ip, NULL, &offset, &modname, str);
/* Weak functions can cause invalid addresses */
if (!ret || offset > FTRACE_MCOUNT_MAX_OFFSET) {
snprintf(str, KSYM_SYMBOL_LEN, "%s_%ld",
FTRACE_INVALID_FUNCTION, offset);
ret = NULL;
}
seq_puts(m, str);
if (modname)
seq_printf(m, " [%s]", modname);
return ret == NULL ? -1 : 0;
}
#else
static inline int test_for_valid_rec(struct dyn_ftrace *rec)
{
return 1;
}
static inline int print_rec(struct seq_file *m, unsigned long ip)
{
seq_printf(m, "%ps", (void *)ip);
return 0;
}
#endif
static int t_show(struct seq_file *m, void *v)
{
struct ftrace_iterator *iter = m->private;
@@ -3655,7 +3745,13 @@ static int t_show(struct seq_file *m, void *v)
if (!rec)
return 0;
seq_printf(m, "%ps", (void *)rec->ip);
if (print_rec(m, rec->ip)) {
/* This should only happen when a rec is disabled */
WARN_ON_ONCE(!(rec->flags & FTRACE_FL_DISABLED));
seq_putc(m, '\n');
return 0;
}
if (iter->flags & FTRACE_ITER_ENABLED) {
struct ftrace_ops *ops;
@@ -3973,6 +4069,24 @@ add_rec_by_index(struct ftrace_hash *hash, struct ftrace_glob *func_g,
return 0;
}
#ifdef FTRACE_MCOUNT_MAX_OFFSET
static int lookup_ip(unsigned long ip, char **modname, char *str)
{
unsigned long offset;
kallsyms_lookup(ip, NULL, &offset, modname, str);
if (offset > FTRACE_MCOUNT_MAX_OFFSET)
return -1;
return 0;
}
#else
static int lookup_ip(unsigned long ip, char **modname, char *str)
{
kallsyms_lookup(ip, NULL, NULL, modname, str);
return 0;
}
#endif
static int
ftrace_match_record(struct dyn_ftrace *rec, struct ftrace_glob *func_g,
struct ftrace_glob *mod_g, int exclude_mod)
@@ -3980,7 +4094,12 @@ ftrace_match_record(struct dyn_ftrace *rec, struct ftrace_glob *func_g,
char str[KSYM_SYMBOL_LEN];
char *modname;
kallsyms_lookup(rec->ip, NULL, NULL, &modname, str);
if (lookup_ip(rec->ip, &modname, str)) {
/* This should only happen when a rec is disabled */
WARN_ON_ONCE(system_state == SYSTEM_RUNNING &&
!(rec->flags & FTRACE_FL_DISABLED));
return 0;
}
if (mod_g) {
int mod_matches = (modname) ? ftrace_match(modname, mod_g) : 0;
@@ -4431,7 +4550,7 @@ int ftrace_func_mapper_add_ip(struct ftrace_func_mapper *mapper,
* @ip: The instruction pointer address to remove the data from
*
* Returns the data if it is found, otherwise NULL.
* Note, if the data pointer is used as the data itself, (see
* Note, if the data pointer is used as the data itself, (see
* ftrace_func_mapper_find_ip(), then the return value may be meaningless,
* if the data pointer was set to zero.
*/
@@ -4526,8 +4645,8 @@ register_ftrace_function_probe(char *glob, struct trace_array *tr,
struct ftrace_probe_ops *probe_ops,
void *data)
{
struct ftrace_func_probe *probe = NULL, *iter;
struct ftrace_func_entry *entry;
struct ftrace_func_probe *probe;
struct ftrace_hash **orig_hash;
struct ftrace_hash *old_hash;
struct ftrace_hash *hash;
@@ -4546,11 +4665,13 @@ register_ftrace_function_probe(char *glob, struct trace_array *tr,
mutex_lock(&ftrace_lock);
/* Check if the probe_ops is already registered */
list_for_each_entry(probe, &tr->func_probes, list) {
if (probe->probe_ops == probe_ops)
list_for_each_entry(iter, &tr->func_probes, list) {
if (iter->probe_ops == probe_ops) {
probe = iter;
break;
}
}
if (&probe->list == &tr->func_probes) {
if (!probe) {
probe = kzalloc(sizeof(*probe), GFP_KERNEL);
if (!probe) {
mutex_unlock(&ftrace_lock);
@@ -4668,9 +4789,9 @@ int
unregister_ftrace_function_probe_func(char *glob, struct trace_array *tr,
struct ftrace_probe_ops *probe_ops)
{
struct ftrace_func_probe *probe = NULL, *iter;
struct ftrace_ops_hash old_hash_ops;
struct ftrace_func_entry *entry;
struct ftrace_func_probe *probe;
struct ftrace_glob func_g;
struct ftrace_hash **orig_hash;
struct ftrace_hash *old_hash;
@@ -4698,11 +4819,13 @@ unregister_ftrace_function_probe_func(char *glob, struct trace_array *tr,
mutex_lock(&ftrace_lock);
/* Check if the probe_ops is already registered */
list_for_each_entry(probe, &tr->func_probes, list) {
if (probe->probe_ops == probe_ops)
list_for_each_entry(iter, &tr->func_probes, list) {
if (iter->probe_ops == probe_ops) {
probe = iter;
break;
}
}
if (&probe->list == &tr->func_probes)
if (!probe)
goto err_unlock_ftrace;
ret = -EINVAL;
@@ -5161,8 +5284,6 @@ int register_ftrace_direct(unsigned long ip, unsigned long addr)
goto out_unlock;
ret = ftrace_set_filter_ip(&direct_ops, ip, 0, 0);
if (ret)
remove_hash_entry(direct_functions, entry);
if (!ret && !(direct_ops.flags & FTRACE_OPS_FL_ENABLED)) {
ret = register_ftrace_function(&direct_ops);
@@ -5171,6 +5292,7 @@ int register_ftrace_direct(unsigned long ip, unsigned long addr)
}
if (ret) {
remove_hash_entry(direct_functions, entry);
kfree(entry);
if (!direct->count) {
list_del_rcu(&direct->next);
@@ -6793,6 +6915,13 @@ void ftrace_module_enable(struct module *mod)
!within_module_init(rec->ip, mod))
break;
/* Weak functions should still be ignored */
if (!test_for_valid_rec(rec)) {
/* Clear all other flags. Should not be enabled anyway */
rec->flags = FTRACE_FL_DISABLED;
continue;
}
cnt = 0;
/*
@@ -6829,11 +6958,16 @@ void ftrace_module_enable(struct module *mod)
void ftrace_module_init(struct module *mod)
{
int ret;
if (ftrace_disabled || !mod->num_ftrace_callsites)
return;
ftrace_process_locs(mod, mod->ftrace_callsites,
mod->ftrace_callsites + mod->num_ftrace_callsites);
ret = ftrace_process_locs(mod, mod->ftrace_callsites,
mod->ftrace_callsites + mod->num_ftrace_callsites);
if (ret)
pr_warn("ftrace: failed to allocate entries for module '%s' functions\n",
mod->name);
}
static void save_ftrace_mod_rec(struct ftrace_mod_map *mod_map,
@@ -7166,15 +7300,19 @@ void __init ftrace_init(void)
pr_info("ftrace: allocating %ld entries in %ld pages\n",
count, count / ENTRIES_PER_PAGE + 1);
last_ftrace_enabled = ftrace_enabled = 1;
ret = ftrace_process_locs(NULL,
__start_mcount_loc,
__stop_mcount_loc);
if (ret) {
pr_warn("ftrace: failed to allocate entries for functions\n");
goto failed;
}
pr_info("ftrace: allocated %ld pages with %ld groups\n",
ftrace_number_of_pages, ftrace_number_of_groups);
last_ftrace_enabled = ftrace_enabled = 1;
set_ftrace_early_filters();
return;


@@ -118,9 +118,9 @@ static inline unsigned int pid_join(unsigned int upper1,
/**
* trace_pid_list_is_set - test if the pid is set in the list
* @pid_list: The pid list to test
* @pid: The pid to to see if set in the list.
* @pid: The pid to see if set in the list.
*
* Tests if @pid is is set in the @pid_list. This is usually called
* Tests if @pid is set in the @pid_list. This is usually called
* from the scheduler when a task is scheduled. Its pid is checked
* if it should be traced or not.
*


@@ -29,6 +29,14 @@
#include <asm/local.h>
/*
* The "absolute" timestamp in the buffer is only 59 bits.
* If a clock has the 5 MSBs set, it needs to be saved and
* reinserted.
*/
#define TS_MSB (0xf8ULL << 56)
#define ABS_TS_MASK (~TS_MSB)
static void update_pages_handler(struct work_struct *work);
/*
@@ -468,6 +476,7 @@ struct rb_time_struct {
local_t cnt;
local_t top;
local_t bottom;
local_t msb;
};
#else
#include <asm/local64.h>
@@ -569,7 +578,6 @@ struct ring_buffer_iter {
* For the ring buffer, 64 bit required operations for the time is
* the following:
*
* - Only need 59 bits (uses 60 to make it even).
* - Reads may fail if it interrupted a modification of the time stamp.
* It will succeed if it did not interrupt another write even if
* the read itself is interrupted by a write.
@@ -594,6 +602,7 @@ struct ring_buffer_iter {
*/
#define RB_TIME_SHIFT 30
#define RB_TIME_VAL_MASK ((1 << RB_TIME_SHIFT) - 1)
#define RB_TIME_MSB_SHIFT 60
static inline int rb_time_cnt(unsigned long val)
{
@@ -613,7 +622,7 @@ static inline u64 rb_time_val(unsigned long top, unsigned long bottom)
static inline bool __rb_time_read(rb_time_t *t, u64 *ret, unsigned long *cnt)
{
unsigned long top, bottom;
unsigned long top, bottom, msb;
unsigned long c;
/*
@@ -625,6 +634,7 @@ static inline bool __rb_time_read(rb_time_t *t, u64 *ret, unsigned long *cnt)
c = local_read(&t->cnt);
top = local_read(&t->top);
bottom = local_read(&t->bottom);
msb = local_read(&t->msb);
} while (c != local_read(&t->cnt));
*cnt = rb_time_cnt(top);
@@ -633,7 +643,8 @@ static inline bool __rb_time_read(rb_time_t *t, u64 *ret, unsigned long *cnt)
if (*cnt != rb_time_cnt(bottom))
return false;
*ret = rb_time_val(top, bottom);
/* The shift to msb will lose its cnt bits */
*ret = rb_time_val(top, bottom) | ((u64)msb << RB_TIME_MSB_SHIFT);
return true;
}
@@ -649,10 +660,12 @@ static inline unsigned long rb_time_val_cnt(unsigned long val, unsigned long cnt
return (val & RB_TIME_VAL_MASK) | ((cnt & 3) << RB_TIME_SHIFT);
}
static inline void rb_time_split(u64 val, unsigned long *top, unsigned long *bottom)
static inline void rb_time_split(u64 val, unsigned long *top, unsigned long *bottom,
unsigned long *msb)
{
*top = (unsigned long)((val >> RB_TIME_SHIFT) & RB_TIME_VAL_MASK);
*bottom = (unsigned long)(val & RB_TIME_VAL_MASK);
*msb = (unsigned long)(val >> RB_TIME_MSB_SHIFT);
}
static inline void rb_time_val_set(local_t *t, unsigned long val, unsigned long cnt)
@@ -663,15 +676,16 @@ static inline void rb_time_val_set(local_t *t, unsigned long val, unsigned long
static void rb_time_set(rb_time_t *t, u64 val)
{
unsigned long cnt, top, bottom;
unsigned long cnt, top, bottom, msb;
rb_time_split(val, &top, &bottom);
rb_time_split(val, &top, &bottom, &msb);
/* Writes always succeed with a valid number even if it gets interrupted. */
do {
cnt = local_inc_return(&t->cnt);
rb_time_val_set(&t->top, top, cnt);
rb_time_val_set(&t->bottom, bottom, cnt);
rb_time_val_set(&t->msb, val >> RB_TIME_MSB_SHIFT, cnt);
} while (cnt != local_read(&t->cnt));
}
@@ -686,8 +700,8 @@ rb_time_read_cmpxchg(local_t *l, unsigned long expect, unsigned long set)
static int rb_time_cmpxchg(rb_time_t *t, u64 expect, u64 set)
{
unsigned long cnt, top, bottom;
unsigned long cnt2, top2, bottom2;
unsigned long cnt, top, bottom, msb;
unsigned long cnt2, top2, bottom2, msb2;
u64 val;
/* The cmpxchg always fails if it interrupted an update */
@@ -703,16 +717,18 @@ static int rb_time_cmpxchg(rb_time_t *t, u64 expect, u64 set)
cnt2 = cnt + 1;
rb_time_split(val, &top, &bottom);
rb_time_split(val, &top, &bottom, &msb);
top = rb_time_val_cnt(top, cnt);
bottom = rb_time_val_cnt(bottom, cnt);
rb_time_split(set, &top2, &bottom2);
rb_time_split(set, &top2, &bottom2, &msb2);
top2 = rb_time_val_cnt(top2, cnt2);
bottom2 = rb_time_val_cnt(bottom2, cnt2);
if (!rb_time_read_cmpxchg(&t->cnt, cnt, cnt2))
return false;
if (!rb_time_read_cmpxchg(&t->msb, msb, msb2))
return false;
if (!rb_time_read_cmpxchg(&t->top, top, top2))
return false;
if (!rb_time_read_cmpxchg(&t->bottom, bottom, bottom2))
@@ -783,6 +799,24 @@ static inline void verify_event(struct ring_buffer_per_cpu *cpu_buffer,
}
#endif
/*
* The absolute time stamp drops the 5 MSBs and some clocks may
* require them. The rb_fix_abs_ts() will take a previous full
* time stamp, and add the 5 MSB of that time stamp on to the
* saved absolute time stamp. Then they are compared in case of
* the unlikely event that the latest time stamp incremented
* the 5 MSB.
*/
static inline u64 rb_fix_abs_ts(u64 abs, u64 save_ts)
{
if (save_ts & TS_MSB) {
abs |= save_ts & TS_MSB;
/* Check for overflow */
if (unlikely(abs < save_ts))
abs += 1ULL << 59;
}
return abs;
}
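A worked example of the overflow case (all values invented for illustration):

	u64 save_ts = (0x08ULL << 56) | 123;	/* last full timestamp */
	u64 abs = 100;				/* 59-bit stamp that wrapped */
	u64 fixed = rb_fix_abs_ts(abs, save_ts);
	/*
	 * Restoring the MSBs gives (0x08ULL << 56) | 100, which is still
	 * below save_ts, so the 59-bit part must have rolled over and
	 * 2^59 is added: fixed == (1ULL << 60) + 100.
	 */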
static inline u64 rb_time_stamp(struct trace_buffer *buffer);
@@ -811,8 +845,10 @@ u64 ring_buffer_event_time_stamp(struct trace_buffer *buffer,
u64 ts;
/* If the event includes an absolute time, then just use that */
if (event->type_len == RINGBUF_TYPE_TIME_STAMP)
return rb_event_time_stamp(event);
if (event->type_len == RINGBUF_TYPE_TIME_STAMP) {
ts = rb_event_time_stamp(event);
return rb_fix_abs_ts(ts, cpu_buffer->tail_page->page->time_stamp);
}
nest = local_read(&cpu_buffer->committing);
verify_event(cpu_buffer, event);
@@ -2754,8 +2790,15 @@ static void rb_add_timestamp(struct ring_buffer_per_cpu *cpu_buffer,
(RB_ADD_STAMP_FORCE | RB_ADD_STAMP_ABSOLUTE);
if (unlikely(info->delta > (1ULL << 59))) {
/*
* Some timers can use more than 59 bits, and when a timestamp
* is added to the buffer, it will lose those bits.
*/
if (abs && (info->ts & TS_MSB)) {
info->delta &= ABS_TS_MASK;
/* did the clock go backwards */
if (info->before == info->after && info->before > info->ts) {
} else if (info->before == info->after && info->before > info->ts) {
/* not interrupted */
static int once;
@@ -3304,7 +3347,7 @@ static void dump_buffer_page(struct buffer_data_page *bpage,
case RINGBUF_TYPE_TIME_STAMP:
delta = rb_event_time_stamp(event);
ts = delta;
ts = rb_fix_abs_ts(delta, ts);
pr_warn(" [%lld] absolute:%lld TIME STAMP\n", ts, delta);
break;
@@ -3380,7 +3423,7 @@ static void check_buffer(struct ring_buffer_per_cpu *cpu_buffer,
case RINGBUF_TYPE_TIME_STAMP:
delta = rb_event_time_stamp(event);
ts = delta;
ts = rb_fix_abs_ts(delta, ts);
break;
case RINGBUF_TYPE_PADDING:
@@ -4367,6 +4410,7 @@ rb_update_read_stamp(struct ring_buffer_per_cpu *cpu_buffer,
case RINGBUF_TYPE_TIME_STAMP:
delta = rb_event_time_stamp(event);
delta = rb_fix_abs_ts(delta, cpu_buffer->read_stamp);
cpu_buffer->read_stamp = delta;
return;
@@ -4397,6 +4441,7 @@ rb_update_iter_read_stamp(struct ring_buffer_iter *iter,
case RINGBUF_TYPE_TIME_STAMP:
delta = rb_event_time_stamp(event);
delta = rb_fix_abs_ts(delta, iter->read_stamp);
iter->read_stamp = delta;
return;
@@ -4650,6 +4695,7 @@ rb_buffer_peek(struct ring_buffer_per_cpu *cpu_buffer, u64 *ts,
case RINGBUF_TYPE_TIME_STAMP:
if (ts) {
*ts = rb_event_time_stamp(event);
*ts = rb_fix_abs_ts(*ts, reader->page->time_stamp);
ring_buffer_normalize_time_stamp(cpu_buffer->buffer,
cpu_buffer->cpu, ts);
}
@@ -4741,6 +4787,7 @@ rb_iter_peek(struct ring_buffer_iter *iter, u64 *ts)
case RINGBUF_TYPE_TIME_STAMP:
if (ts) {
*ts = rb_event_time_stamp(event);
*ts = rb_fix_abs_ts(*ts, iter->head_page->page->time_stamp);
ring_buffer_normalize_time_stamp(cpu_buffer->buffer,
cpu_buffer->cpu, ts);
}
@@ -6011,10 +6058,10 @@ static __init int test_ringbuffer(void)
pr_info(" total events: %ld\n", total_lost + total_read);
pr_info(" recorded len bytes: %ld\n", total_len);
pr_info(" recorded size bytes: %ld\n", total_size);
if (total_lost)
if (total_lost) {
pr_info(" With dropped events, record len and size may not match\n"
" alloced and written from above\n");
if (!total_lost) {
} else {
if (RB_WARN_ON(buffer, total_len != total_alloc ||
total_size != total_written))
break;

Some files were not shown because too many files have changed in this diff.