Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf updates from Ingo Molnar:
 "The main changes in this cycle were:

  Kernel:

   - kprobes updates: use better W^X patterns for code modifications,
     improve optprobes, remove jprobes. (Masami Hiramatsu, Kees Cook)

   - core fixes: event timekeeping (enabled/running times statistics)
     fixes, perf_event_read() locking fixes and cleanups, etc. (Peter
     Zijlstra)

   - Extend x86 Intel free-running PEBS support and support x86
     user-register sampling in perf record and perf script. (Andi Kleen)

  Tooling:

   - Completely rework the way inline frames are handled. Instead of
     querying for the inline nodes on-demand in the individual tools, we
     now create proper callchain nodes for inlined frames. (Milian
     Wolff)

   - 'perf trace' updates (Arnaldo Carvalho de Melo)

   - Implement a way to print formatted output to per-event files in
     'perf script' to facilitate generating flamegraphs, eliminating the
     need to write scripts to do that separation (yuzhoujian, Arnaldo
     Carvalho de Melo)

   - Update vendor events JSON metrics for Intel's Broadwell, Broadwell
     Server, Haswell, Haswell Server, IvyBridge, IvyTown, JakeTown,
     Sandy Bridge, Skylake, SkyLake Server - and Goldmont Plus V1 (Andi
     Kleen, Kan Liang)

   - Multithread the synthesizing of PERF_RECORD_ events for
     pre-existing threads in 'perf top', speeding up that phase, greatly
     improving the user experience in systems such as Intel's Knights
     Mill (Kan Liang)

   - Introduce the concept of weak groups in 'perf stat': try to set up
     a group, but if it's not schedulable fall back to not using a group.
     That gives us the best of both worlds: groups if they work, but
     still a usable fallback if they don't; an example invocation is
     noted after this summary. (Andi Kleen)

   - perf sched timehist enhancements (David Ahern)

   - ... various other enhancements, updates, cleanups and fixes"
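
For the weak-group change above, the modifier added by the tooling patches in
this pull is spelled 'W' at the end of an event group; a typical invocation
would look roughly like "perf stat -e '{cycles,instructions,cache-misses}:W' -a sleep 1"
(illustrative only; check the perf-stat documentation updated in this series
for the exact syntax).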

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (139 commits)
  kprobes: Don't spam the build log with deprecation warnings
  arm/kprobes: Remove jprobe test case
  arm/kprobes: Fix kretprobe test to check correct counter
  perf srcline: Show correct function name for srcline of callchains
  perf srcline: Fix memory leak in addr2inlines()
  perf trace beauty kcmp: Beautify arguments
  perf trace beauty: Implement pid_fd beautifier
  tools include uapi: Grab a copy of linux/kcmp.h
  perf callchain: Fix double mapping al->addr for children without self period
  perf stat: Make --per-thread update shadow stats to show metrics
  perf stat: Move the shadow stats scale computation in perf_stat__update_shadow_stats
  perf tools: Add perf_data_file__write function
  perf tools: Add struct perf_data_file
  perf tools: Rename struct perf_data_file to perf_data
  perf script: Print information about per-event-dump files
  perf trace beauty prctl: Generate 'option' string table from kernel headers
  tools include uapi: Grab a copy of linux/prctl.h
  perf script: Allow creating per-event dump files
  perf evsel: Restore evsel->priv as a tool private area
  perf script: Use event_format__fprintf()
  ...
Committed by Linus Torvalds on 2017-11-13 13:05:08 -08:00
182 changed files with 8389 additions and 2540 deletions
+57 -102
@@ -8,7 +8,7 @@ Kernel Probes (Kprobes)
.. CONTENTS
1. Concepts: Kprobes, Jprobes, Return Probes
1. Concepts: Kprobes, and Return Probes
2. Architectures Supported
3. Configuring Kprobes
4. API Reference
@@ -16,12 +16,12 @@ Kernel Probes (Kprobes)
6. Probe Overhead
7. TODO
8. Kprobes Example
9. Jprobes Example
10. Kretprobes Example
9. Kretprobes Example
10. Deprecated Features
Appendix A: The kprobes debugfs interface
Appendix B: The kprobes sysctl interface
Concepts: Kprobes, Jprobes, Return Probes
Concepts: Kprobes and Return Probes
=========================================
Kprobes enables you to dynamically break into any kernel routine and
@@ -32,12 +32,10 @@ routine to be invoked when the breakpoint is hit.
.. [1] some parts of the kernel code can not be trapped, see
:ref:`kprobes_blacklist`)
There are currently three types of probes: kprobes, jprobes, and
kretprobes (also called return probes). A kprobe can be inserted
on virtually any instruction in the kernel. A jprobe is inserted at
the entry to a kernel function, and provides convenient access to the
function's arguments. A return probe fires when a specified function
returns.
There are currently two types of probes: kprobes, and kretprobes
(also called return probes). A kprobe can be inserted on virtually
any instruction in the kernel. A return probe fires when a specified
function returns.
In the typical case, Kprobes-based instrumentation is packaged as
a kernel module. The module's init function installs ("registers")
@@ -82,45 +80,6 @@ After the instruction is single-stepped, Kprobes executes the
"post_handler," if any, that is associated with the kprobe.
Execution then continues with the instruction following the probepoint.
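A minimal, self-contained kprobes module following this flow might look as
below (handler and variable names are only illustrative; _do_fork is the
symbol also probed by the sample modules in this tree)::

    #include <linux/kernel.h>
    #include <linux/module.h>
    #include <linux/kprobes.h>

    static int my_pre(struct kprobe *p, struct pt_regs *regs)
    {
            pr_info("pre: %s hit, ip=%lx\n",
                    p->symbol_name, instruction_pointer(regs));
            return 0;       /* 0 = continue with the probed instruction */
    }

    static void my_post(struct kprobe *p, struct pt_regs *regs,
                        unsigned long flags)
    {
            pr_info("post: %s done\n", p->symbol_name);
    }

    static struct kprobe my_kp = {
            .symbol_name  = "_do_fork",
            .pre_handler  = my_pre,
            /* note: a post_handler keeps the probepoint from being optimized */
            .post_handler = my_post,
    };

    static int __init my_init(void)
    {
            return register_kprobe(&my_kp);
    }

    static void __exit my_exit(void)
    {
            unregister_kprobe(&my_kp);
    }

    module_init(my_init);
    module_exit(my_exit);
    MODULE_LICENSE("GPL");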
How Does a Jprobe Work?
-----------------------
A jprobe is implemented using a kprobe that is placed on a function's
entry point. It employs a simple mirroring principle to allow
seamless access to the probed function's arguments. The jprobe
handler routine should have the same signature (arg list and return
type) as the function being probed, and must always end by calling
the Kprobes function jprobe_return().
Here's how it works. When the probe is hit, Kprobes makes a copy of
the saved registers and a generous portion of the stack (see below).
Kprobes then points the saved instruction pointer at the jprobe's
handler routine, and returns from the trap. As a result, control
passes to the handler, which is presented with the same register and
stack contents as the probed function. When it is done, the handler
calls jprobe_return(), which traps again to restore the original stack
contents and processor state and switch to the probed function.
By convention, the callee owns its arguments, so gcc may produce code
that unexpectedly modifies that portion of the stack. This is why
Kprobes saves a copy of the stack and restores it after the jprobe
handler has run. Up to MAX_STACK_SIZE bytes are copied -- e.g.,
64 bytes on i386.
Note that the probed function's args may be passed on the stack
or in registers. The jprobe will work in either case, so long as the
handler's prototype matches that of the probed function.
Note that in some architectures (e.g.: arm64 and sparc64) the stack
copy is not done, as the actual location of stacked parameters may be
outside of a reasonable MAX_STACK_SIZE value and because that location
cannot be determined by the jprobes code. In this case the jprobes
user must be careful to make certain the calling signature of the
function does not cause parameters to be passed on the stack (e.g.:
more than eight function arguments, an argument of more than sixteen
bytes, or more than 64 bytes of argument data, depending on
architecture).
Return Probes
-------------
@@ -245,8 +204,7 @@ Pre-optimization
After preparing the detour buffer, Kprobes verifies that none of the
following situations exist:
- The probe has either a break_handler (i.e., it's a jprobe) or a
post_handler.
- The probe has a post_handler.
- Other instructions in the optimized region are probed.
- The probe is disabled.
@@ -331,7 +289,7 @@ rejects registering it, if the given address is in the blacklist.
Architectures Supported
=======================
Kprobes, jprobes, and return probes are implemented on the following
Kprobes and return probes are implemented on the following
architectures:
- i386 (Supports jump optimization)
@@ -446,27 +404,6 @@ architecture-specific trap number associated with the fault (e.g.,
on i386, 13 for a general protection fault or 14 for a page fault).
Returns 1 if it successfully handled the exception.
register_jprobe
---------------
::
#include <linux/kprobes.h>
int register_jprobe(struct jprobe *jp)
Sets a breakpoint at the address jp->kp.addr, which must be the address
of the first instruction of a function. When the breakpoint is hit,
Kprobes runs the handler whose address is jp->entry.
The handler should have the same arg list and return type as the probed
function; and just before it returns, it must call jprobe_return().
(The handler never actually returns, since jprobe_return() returns
control to Kprobes.) If the probed function is declared asmlinkage
or anything else that affects how args are passed, the handler's
declaration must match.
register_jprobe() returns 0 on success, or a negative errno otherwise.
register_kretprobe
------------------
@@ -513,7 +450,6 @@ unregister_*probe
#include <linux/kprobes.h>
void unregister_kprobe(struct kprobe *kp);
void unregister_jprobe(struct jprobe *jp);
void unregister_kretprobe(struct kretprobe *rp);
Removes the specified probe. The unregister function can be called
@@ -532,7 +468,6 @@ register_*probes
#include <linux/kprobes.h>
int register_kprobes(struct kprobe **kps, int num);
int register_kretprobes(struct kretprobe **rps, int num);
int register_jprobes(struct jprobe **jps, int num);
Registers each of the num probes in the specified array. If any
error occurs during registration, all probes in the array, up to
@@ -555,7 +490,6 @@ unregister_*probes
#include <linux/kprobes.h>
void unregister_kprobes(struct kprobe **kps, int num);
void unregister_kretprobes(struct kretprobe **rps, int num);
void unregister_jprobes(struct jprobe **jps, int num);
Removes each of the num probes in the specified array at once.
@@ -574,7 +508,6 @@ disable_*probe
#include <linux/kprobes.h>
int disable_kprobe(struct kprobe *kp);
int disable_kretprobe(struct kretprobe *rp);
int disable_jprobe(struct jprobe *jp);
Temporarily disables the specified ``*probe``. You can enable it again by using
enable_*probe(). You must specify the probe which has been registered.
@@ -587,7 +520,6 @@ enable_*probe
#include <linux/kprobes.h>
int enable_kprobe(struct kprobe *kp);
int enable_kretprobe(struct kretprobe *rp);
int enable_jprobe(struct jprobe *jp);
Enables ``*probe`` which has been disabled by disable_*probe(). You must specify
the probe which has been registered.
@@ -595,12 +527,10 @@ the probe which has been registered.
Kprobes Features and Limitations
================================
Kprobes allows multiple probes at the same address. Currently,
however, there cannot be multiple jprobes on the same function at
the same time. Also, a probepoint for which there is a jprobe or
a post_handler cannot be optimized. So if you install a jprobe,
or a kprobe with a post_handler, at an optimized probepoint, the
probepoint will be unoptimized automatically.
Kprobes allows multiple probes at the same address. Also,
a probepoint for which there is a post_handler cannot be optimized.
So if you install a kprobe with a post_handler, at an optimized
probepoint, the probepoint will be unoptimized automatically.
In general, you can install a probe anywhere in the kernel.
In particular, you can probe interrupt handlers. Known exceptions
@@ -662,7 +592,7 @@ We're unaware of other specific cases where this could be a problem.
If, upon entry to or exit from a function, the CPU is running on
a stack other than that of the current task, registering a return
probe on that function may produce undesirable results. For this
reason, Kprobes doesn't support return probes (or kprobes or jprobes)
reason, Kprobes doesn't support return probes (or kprobes)
on the x86_64 version of __switch_to(); the registration functions
return -EINVAL.
@@ -706,24 +636,24 @@ Probe Overhead
On a typical CPU in use in 2005, a kprobe hit takes 0.5 to 1.0
microseconds to process. Specifically, a benchmark that hits the same
probepoint repeatedly, firing a simple handler each time, reports 1-2
million hits per second, depending on the architecture. A jprobe or
return-probe hit typically takes 50-75% longer than a kprobe hit.
million hits per second, depending on the architecture. A return-probe
hit typically takes 50-75% longer than a kprobe hit.
When you have a return probe set on a function, adding a kprobe at
the entry to that function adds essentially no overhead.
Here are sample overhead figures (in usec) for different architectures::
k = kprobe; j = jprobe; r = return probe; kr = kprobe + return probe
on same function; jr = jprobe + return probe on same function::
k = kprobe; r = return probe; kr = kprobe + return probe
on same function
i386: Intel Pentium M, 1495 MHz, 2957.31 bogomips
k = 0.57 usec; j = 1.00; r = 0.92; kr = 0.99; jr = 1.40
k = 0.57 usec; r = 0.92; kr = 0.99
x86_64: AMD Opteron 246, 1994 MHz, 3971.48 bogomips
k = 0.49 usec; j = 0.76; r = 0.80; kr = 0.82; jr = 1.07
k = 0.49 usec; r = 0.80; kr = 0.82
ppc64: POWER5 (gr), 1656 MHz (SMT disabled, 1 virtual CPU per physical CPU)
k = 0.77 usec; j = 1.31; r = 1.26; kr = 1.45; jr = 1.99
k = 0.77 usec; r = 1.26; kr = 1.45
Optimized Probe Overhead
------------------------
@@ -755,11 +685,6 @@ Kprobes Example
See samples/kprobes/kprobe_example.c
Jprobes Example
===============
See samples/kprobes/jprobe_example.c
Kretprobes Example
==================
@@ -772,6 +697,37 @@ For additional information on Kprobes, refer to the following URLs:
- http://www-users.cs.umn.edu/~boutcher/kprobes/
- http://www.linuxsymposium.org/2006/linuxsymposium_procv2.pdf (pages 101-115)
Deprecated Features
===================
Jprobes is now a deprecated feature. People who depend on it should
migrate to other tracing features or use older kernels. Please consider
migrating your tool to one of the following options (a minimal kprobe-based
replacement is also sketched after this list):
- Use trace-event to trace the target function with arguments.
trace-event is a low-overhead (and almost no visible overhead if it
is off) statically defined event interface. You can define new events
and trace them via ftrace or any other tracing tools.
See the following URLs:
- https://lwn.net/Articles/379903/
- https://lwn.net/Articles/381064/
- https://lwn.net/Articles/383362/
- Use ftrace dynamic events (kprobe events) with perf-probe.
If you build your kernel with debug info (CONFIG_DEBUG_INFO=y), you can
find which register/stack slot is assigned to which local variable or
argument by using perf-probe, and set up a new event to trace it.
See the following documents:
- Documentation/trace/kprobetrace.txt
- Documentation/trace/events.txt
- tools/perf/Documentation/perf-probe.txt
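Beyond those two options, a jprobe that only dumped arguments can also be
replaced in-kernel by a plain kprobe whose pre-handler reads the calling
registers. A purely illustrative sketch (register names assume the x86_64
calling convention, where the first two integer arguments arrive in %rdi
and %rsi)::

    #include <linux/kernel.h>
    #include <linux/kprobes.h>
    #include <linux/ptrace.h>

    static int fork_pre(struct kprobe *p, struct pt_regs *regs)
    {
            /* x86_64: arg1 in %rdi, arg2 in %rsi */
            pr_info("_do_fork: clone_flags=0x%lx stack_start=0x%lx\n",
                    regs->di, regs->si);
            return 0;
    }

    static struct kprobe fork_kp = {
            .symbol_name = "_do_fork",
            .pre_handler = fork_pre,
    };

    /* register_kprobe(&fork_kp) from module init,
     * unregister_kprobe(&fork_kp) from module exit, as in the kprobe sample */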
The kprobes debugfs interface
=============================
@@ -783,14 +739,13 @@ under the /sys/kernel/debug/kprobes/ directory (assuming debugfs is mounted at /
/sys/kernel/debug/kprobes/list: Lists all registered probes on the system::
c015d71a k vfs_read+0x0
c011a316 j do_fork+0x0
c03dedc5 r tcp_v4_rcv+0x0
The first column provides the kernel address where the probe is inserted.
The second column identifies the type of probe (k - kprobe, r - kretprobe
and j - jprobe), while the third column specifies the symbol+offset of
the probe. If the probed function belongs to a module, the module name
is also specified. Following columns show probe status. If the probe is on
The second column identifies the type of probe (k - kprobe and r - kretprobe)
while the third column specifies the symbol+offset of the probe.
If the probed function belongs to a module, the module name is also
specified. Following columns show probe status. If the probe is on
a virtual address that is no longer valid (module init sections, module
virtual addresses that correspond to modules that've been unloaded),
such probes are marked with [GONE]. If the probe is temporarily disabled,
+1 -1
@@ -91,7 +91,7 @@ config STATIC_KEYS_SELFTEST
config OPTPROBES
def_bool y
depends on KPROBES && HAVE_OPTPROBES
depends on !PREEMPT
select TASKS_RCU if PREEMPT
config KPROBES_ON_FTRACE
def_bool y
+1 -58
@@ -227,7 +227,6 @@ static bool test_regs_ok;
static int test_func_instance;
static int pre_handler_called;
static int post_handler_called;
static int jprobe_func_called;
static int kretprobe_handler_called;
static int tests_failed;
@@ -370,50 +369,6 @@ static int test_kprobe(long (*func)(long, long))
return 0;
}
static void __kprobes jprobe_func(long r0, long r1)
{
jprobe_func_called = test_func_instance;
if (r0 == FUNC_ARG1 && r1 == FUNC_ARG2)
test_regs_ok = true;
jprobe_return();
}
static struct jprobe the_jprobe = {
.entry = jprobe_func,
};
static int test_jprobe(long (*func)(long, long))
{
int ret;
the_jprobe.kp.addr = (kprobe_opcode_t *)func;
ret = register_jprobe(&the_jprobe);
if (ret < 0) {
pr_err("FAIL: register_jprobe failed with %d\n", ret);
return ret;
}
ret = call_test_func(func, true);
unregister_jprobe(&the_jprobe);
the_jprobe.kp.flags = 0; /* Clear disable flag to allow reuse */
if (!ret)
return -EINVAL;
if (jprobe_func_called != test_func_instance) {
pr_err("FAIL: jprobe handler function not called\n");
return -EINVAL;
}
if (!call_test_func(func, false))
return -EINVAL;
if (jprobe_func_called == test_func_instance) {
pr_err("FAIL: probe called after unregistering\n");
return -EINVAL;
}
return 0;
}
static int __kprobes
kretprobe_handler(struct kretprobe_instance *ri, struct pt_regs *regs)
{
@@ -451,7 +406,7 @@ static int test_kretprobe(long (*func)(long, long))
}
if (!call_test_func(func, false))
return -EINVAL;
if (jprobe_func_called == test_func_instance) {
if (kretprobe_handler_called == test_func_instance) {
pr_err("FAIL: kretprobe called after unregistering\n");
return -EINVAL;
}
@@ -468,18 +423,6 @@ static int run_api_tests(long (*func)(long, long))
if (ret < 0)
return ret;
pr_info(" jprobe\n");
ret = test_jprobe(func);
#if defined(CONFIG_THUMB2_KERNEL) && !defined(MODULE)
if (ret == -EINVAL) {
pr_err("FAIL: Known longtime bug with jprobe on Thumb kernels\n");
tests_failed = ret;
ret = 0;
}
#endif
if (ret < 0)
return ret;
pr_info(" kretprobe\n");
ret = test_kretprobe(func);
if (ret < 0)
+4
@@ -2958,6 +2958,10 @@ static unsigned long intel_pmu_free_running_flags(struct perf_event *event)
if (event->attr.use_clockid)
flags &= ~PERF_SAMPLE_TIME;
if (!event->attr.exclude_kernel)
flags &= ~PERF_SAMPLE_REGS_USER;
if (event->attr.sample_regs_user & ~PEBS_REGS)
flags &= ~(PERF_SAMPLE_REGS_USER | PERF_SAMPLE_REGS_INTR);
return flags;
}
+23 -1
@@ -85,13 +85,15 @@ struct amd_nb {
* Flags PEBS can handle without an PMI.
*
* TID can only be handled by flushing at context switch.
* REGS_USER can be handled for events limited to ring 3.
*
*/
#define PEBS_FREERUNNING_FLAGS \
(PERF_SAMPLE_IP | PERF_SAMPLE_TID | PERF_SAMPLE_ADDR | \
PERF_SAMPLE_ID | PERF_SAMPLE_CPU | PERF_SAMPLE_STREAM_ID | \
PERF_SAMPLE_DATA_SRC | PERF_SAMPLE_IDENTIFIER | \
PERF_SAMPLE_TRANSACTION | PERF_SAMPLE_PHYS_ADDR)
PERF_SAMPLE_TRANSACTION | PERF_SAMPLE_PHYS_ADDR | \
PERF_SAMPLE_REGS_INTR | PERF_SAMPLE_REGS_USER)
/*
* A debug store configuration.
@@ -110,6 +112,26 @@ struct debug_store {
u64 pebs_event_reset[MAX_PEBS_EVENTS];
};
#define PEBS_REGS \
(PERF_REG_X86_AX | \
PERF_REG_X86_BX | \
PERF_REG_X86_CX | \
PERF_REG_X86_DX | \
PERF_REG_X86_DI | \
PERF_REG_X86_SI | \
PERF_REG_X86_SP | \
PERF_REG_X86_BP | \
PERF_REG_X86_IP | \
PERF_REG_X86_FLAGS | \
PERF_REG_X86_R8 | \
PERF_REG_X86_R9 | \
PERF_REG_X86_R10 | \
PERF_REG_X86_R11 | \
PERF_REG_X86_R12 | \
PERF_REG_X86_R13 | \
PERF_REG_X86_R14 | \
PERF_REG_X86_R15)
/*
* Per register state.
*/
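The REGS_USER extension above is consumed from user space through
perf_event_attr. A rough, self-contained sketch of requesting user-register
samples on x86-64 (names taken from the uapi headers; the event choice and
sample period are arbitrary)::

    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/syscall.h>
    #include <linux/perf_event.h>
    #include <asm/perf_regs.h>

    int main(void)
    {
            struct perf_event_attr attr;
            int fd;

            memset(&attr, 0, sizeof(attr));
            attr.size             = sizeof(attr);
            attr.type             = PERF_TYPE_HARDWARE;
            attr.config           = PERF_COUNT_HW_CPU_CYCLES;
            attr.sample_period    = 100000;
            attr.sample_type      = PERF_SAMPLE_IP | PERF_SAMPLE_REGS_USER;
            attr.sample_regs_user = (1ULL << PERF_REG_X86_IP) |
                                    (1ULL << PERF_REG_X86_SP);
            attr.exclude_kernel   = 1;  /* ring-3 only, as the comment above requires */

            fd = syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
            if (fd < 0) {
                    perror("perf_event_open");
                    return 1;
            }
            /* samples carrying user IP/SP are then read from the mmap'ed ring buffer */
            close(fd);
            return 0;
    }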
+2 -2
@@ -58,8 +58,8 @@ extern __visible kprobe_opcode_t optprobe_template_call[];
extern __visible kprobe_opcode_t optprobe_template_end[];
#define MAX_OPTIMIZED_LENGTH (MAX_INSN_SIZE + RELATIVE_ADDR_SIZE)
#define MAX_OPTINSN_SIZE \
(((unsigned long)&optprobe_template_end - \
(unsigned long)&optprobe_template_entry) + \
(((unsigned long)optprobe_template_end - \
(unsigned long)optprobe_template_entry) + \
MAX_OPTIMIZED_LENGTH + RELATIVEJUMP_SIZE)
extern const int kretprobe_blacklist_size;
+3 -3
@@ -85,11 +85,11 @@ extern unsigned long recover_probed_instruction(kprobe_opcode_t *buf,
* Copy an instruction and adjust the displacement if the instruction
* uses the %rip-relative addressing mode.
*/
extern int __copy_instruction(u8 *dest, u8 *src, struct insn *insn);
extern int __copy_instruction(u8 *dest, u8 *src, u8 *real, struct insn *insn);
/* Generate a relative-jump/call instruction */
extern void synthesize_reljump(void *from, void *to);
extern void synthesize_relcall(void *from, void *to);
extern void synthesize_reljump(void *dest, void *from, void *to);
extern void synthesize_relcall(void *dest, void *from, void *to);
#ifdef CONFIG_OPTPROBES
extern int setup_detour_execution(struct kprobe *p, struct pt_regs *regs, int reenter);
+40 -21
@@ -119,29 +119,29 @@ struct kretprobe_blackpoint kretprobe_blacklist[] = {
const int kretprobe_blacklist_size = ARRAY_SIZE(kretprobe_blacklist);
static nokprobe_inline void
__synthesize_relative_insn(void *from, void *to, u8 op)
__synthesize_relative_insn(void *dest, void *from, void *to, u8 op)
{
struct __arch_relative_insn {
u8 op;
s32 raddr;
} __packed *insn;
insn = (struct __arch_relative_insn *)from;
insn = (struct __arch_relative_insn *)dest;
insn->raddr = (s32)((long)(to) - ((long)(from) + 5));
insn->op = op;
}
/* Insert a jump instruction at address 'from', which jumps to address 'to'.*/
void synthesize_reljump(void *from, void *to)
void synthesize_reljump(void *dest, void *from, void *to)
{
__synthesize_relative_insn(from, to, RELATIVEJUMP_OPCODE);
__synthesize_relative_insn(dest, from, to, RELATIVEJUMP_OPCODE);
}
NOKPROBE_SYMBOL(synthesize_reljump);
/* Insert a call instruction at address 'from', which calls address 'to'.*/
void synthesize_relcall(void *from, void *to)
void synthesize_relcall(void *dest, void *from, void *to)
{
__synthesize_relative_insn(from, to, RELATIVECALL_OPCODE);
__synthesize_relative_insn(dest, from, to, RELATIVECALL_OPCODE);
}
NOKPROBE_SYMBOL(synthesize_relcall);
@@ -346,10 +346,11 @@ static int is_IF_modifier(kprobe_opcode_t *insn)
/*
* Copy an instruction with recovering modified instruction by kprobes
* and adjust the displacement if the instruction uses the %rip-relative
* addressing mode.
* addressing mode. Note that since @real will be the final place of the
* copied instruction, the displacement must be adjusted by @real, not @dest.
* This returns the length of copied instruction, or 0 if it has an error.
*/
int __copy_instruction(u8 *dest, u8 *src, struct insn *insn)
int __copy_instruction(u8 *dest, u8 *src, u8 *real, struct insn *insn)
{
kprobe_opcode_t buf[MAX_INSN_SIZE];
unsigned long recovered_insn =
@@ -387,11 +388,11 @@ int __copy_instruction(u8 *dest, u8 *src, struct insn *insn)
* have given.
*/
newdisp = (u8 *) src + (s64) insn->displacement.value
- (u8 *) dest;
- (u8 *) real;
if ((s64) (s32) newdisp != newdisp) {
pr_err("Kprobes error: new displacement does not fit into s32 (%llx)\n", newdisp);
pr_err("\tSrc: %p, Dest: %p, old disp: %x\n",
src, dest, insn->displacement.value);
src, real, insn->displacement.value);
return 0;
}
disp = (u8 *) dest + insn_offset_displacement(insn);
@@ -402,20 +403,38 @@ int __copy_instruction(u8 *dest, u8 *src, struct insn *insn)
}
/* Prepare reljump right after instruction to boost */
static void prepare_boost(struct kprobe *p, struct insn *insn)
static int prepare_boost(kprobe_opcode_t *buf, struct kprobe *p,
struct insn *insn)
{
int len = insn->length;
if (can_boost(insn, p->addr) &&
MAX_INSN_SIZE - insn->length >= RELATIVEJUMP_SIZE) {
MAX_INSN_SIZE - len >= RELATIVEJUMP_SIZE) {
/*
* These instructions can be executed directly if it
* jumps back to correct address.
*/
synthesize_reljump(p->ainsn.insn + insn->length,
synthesize_reljump(buf + len, p->ainsn.insn + len,
p->addr + insn->length);
len += RELATIVEJUMP_SIZE;
p->ainsn.boostable = true;
} else {
p->ainsn.boostable = false;
}
return len;
}
/* Make page to RO mode when allocate it */
void *alloc_insn_page(void)
{
void *page;
page = module_alloc(PAGE_SIZE);
if (page)
set_memory_ro((unsigned long)page & PAGE_MASK, 1);
return page;
}
/* Recover page to RW mode before releasing it */
@@ -429,12 +448,11 @@ void free_insn_page(void *page)
static int arch_copy_kprobe(struct kprobe *p)
{
struct insn insn;
kprobe_opcode_t buf[MAX_INSN_SIZE];
int len;
set_memory_rw((unsigned long)p->ainsn.insn & PAGE_MASK, 1);
/* Copy an instruction with recovering if other optprobe modifies it.*/
len = __copy_instruction(p->ainsn.insn, p->addr, &insn);
len = __copy_instruction(buf, p->addr, p->ainsn.insn, &insn);
if (!len)
return -EINVAL;
@@ -442,15 +460,16 @@ static int arch_copy_kprobe(struct kprobe *p)
* __copy_instruction can modify the displacement of the instruction,
* but it doesn't affect boostable check.
*/
prepare_boost(p, &insn);
set_memory_ro((unsigned long)p->ainsn.insn & PAGE_MASK, 1);
len = prepare_boost(buf, p, &insn);
/* Check whether the instruction modifies Interrupt Flag or not */
p->ainsn.if_modifier = is_IF_modifier(p->ainsn.insn);
p->ainsn.if_modifier = is_IF_modifier(buf);
/* Also, displacement change doesn't affect the first byte */
p->opcode = p->ainsn.insn[0];
p->opcode = buf[0];
/* OK, write back the instruction(s) into ROX insn buffer */
text_poke(p->ainsn.insn, buf, len);
return 0;
}
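Reduced to its skeleton, the W^X pattern applied in the hunk above is: stage
the bytes in an ordinary RW buffer, compute displacements against the final
slot, and perform a single text_poke() into the read-only executable page
(condensed from the code above; error handling omitted)::

    kprobe_opcode_t buf[MAX_INSN_SIZE];     /* RW staging buffer on the stack */
    int len;

    /* copy/fix up against the final slot p->ainsn.insn, not against buf */
    len = __copy_instruction(buf, p->addr, p->ainsn.insn, &insn);
    if (!len)
            return -EINVAL;
    len = prepare_boost(buf, p, &insn);     /* may append a reljump into buf */

    /* one write into the ROX slot; no set_memory_rw()/set_memory_ro() flip */
    text_poke(p->ainsn.insn, buf, len);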
+16 -16
@@ -26,7 +26,7 @@
#include "common.h"
static nokprobe_inline
int __skip_singlestep(struct kprobe *p, struct pt_regs *regs,
void __skip_singlestep(struct kprobe *p, struct pt_regs *regs,
struct kprobe_ctlblk *kcb, unsigned long orig_ip)
{
/*
@@ -41,33 +41,31 @@ int __skip_singlestep(struct kprobe *p, struct pt_regs *regs,
__this_cpu_write(current_kprobe, NULL);
if (orig_ip)
regs->ip = orig_ip;
return 1;
}
int skip_singlestep(struct kprobe *p, struct pt_regs *regs,
struct kprobe_ctlblk *kcb)
{
if (kprobe_ftrace(p))
return __skip_singlestep(p, regs, kcb, 0);
else
return 0;
if (kprobe_ftrace(p)) {
__skip_singlestep(p, regs, kcb, 0);
preempt_enable_no_resched();
return 1;
}
return 0;
}
NOKPROBE_SYMBOL(skip_singlestep);
/* Ftrace callback handler for kprobes */
/* Ftrace callback handler for kprobes -- called under preempt disabled */
void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
struct ftrace_ops *ops, struct pt_regs *regs)
{
struct kprobe *p;
struct kprobe_ctlblk *kcb;
unsigned long flags;
/* Disable irq for emulating a breakpoint and avoiding preempt */
local_irq_save(flags);
/* Preempt is disabled by ftrace */
p = get_kprobe((kprobe_opcode_t *)ip);
if (unlikely(!p) || kprobe_disabled(p))
goto end;
return;
kcb = get_kprobe_ctlblk();
if (kprobe_running()) {
@@ -77,17 +75,19 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
/* Kprobe handler expects regs->ip = ip + 1 as breakpoint hit */
regs->ip = ip + sizeof(kprobe_opcode_t);
/* To emulate trap based kprobes, preempt_disable here */
preempt_disable();
__this_cpu_write(current_kprobe, p);
kcb->kprobe_status = KPROBE_HIT_ACTIVE;
if (!p->pre_handler || !p->pre_handler(p, regs))
if (!p->pre_handler || !p->pre_handler(p, regs)) {
__skip_singlestep(p, regs, kcb, orig_ip);
preempt_enable_no_resched();
}
/*
* If pre_handler returns !0, it sets regs->ip and
* resets current kprobe.
* resets current kprobe, and keep preempt count +1.
*/
}
end:
local_irq_restore(flags);
}
NOKPROBE_SYMBOL(kprobe_ftrace_handler);
+43 -36
@@ -142,11 +142,11 @@ void optprobe_template_func(void);
STACK_FRAME_NON_STANDARD(optprobe_template_func);
#define TMPL_MOVE_IDX \
((long)&optprobe_template_val - (long)&optprobe_template_entry)
((long)optprobe_template_val - (long)optprobe_template_entry)
#define TMPL_CALL_IDX \
((long)&optprobe_template_call - (long)&optprobe_template_entry)
((long)optprobe_template_call - (long)optprobe_template_entry)
#define TMPL_END_IDX \
((long)&optprobe_template_end - (long)&optprobe_template_entry)
((long)optprobe_template_end - (long)optprobe_template_entry)
#define INT3_SIZE sizeof(kprobe_opcode_t)
@@ -154,17 +154,15 @@ STACK_FRAME_NON_STANDARD(optprobe_template_func);
static void
optimized_callback(struct optimized_kprobe *op, struct pt_regs *regs)
{
struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
unsigned long flags;
/* This is possible if op is under delayed unoptimizing */
if (kprobe_disabled(&op->kp))
return;
local_irq_save(flags);
preempt_disable();
if (kprobe_running()) {
kprobes_inc_nmissed_count(&op->kp);
} else {
struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
/* Save skipped registers */
#ifdef CONFIG_X86_64
regs->cs = __KERNEL_CS;
@@ -180,17 +178,17 @@ optimized_callback(struct optimized_kprobe *op, struct pt_regs *regs)
opt_pre_handler(&op->kp, regs);
__this_cpu_write(current_kprobe, NULL);
}
local_irq_restore(flags);
preempt_enable_no_resched();
}
NOKPROBE_SYMBOL(optimized_callback);
static int copy_optimized_instructions(u8 *dest, u8 *src)
static int copy_optimized_instructions(u8 *dest, u8 *src, u8 *real)
{
struct insn insn;
int len = 0, ret;
while (len < RELATIVEJUMP_SIZE) {
ret = __copy_instruction(dest + len, src + len, &insn);
ret = __copy_instruction(dest + len, src + len, real, &insn);
if (!ret || !can_boost(&insn, src + len))
return -EINVAL;
len += ret;
@@ -343,57 +341,66 @@ void arch_remove_optimized_kprobe(struct optimized_kprobe *op)
int arch_prepare_optimized_kprobe(struct optimized_kprobe *op,
struct kprobe *__unused)
{
u8 *buf;
int ret;
u8 *buf = NULL, *slot;
int ret, len;
long rel;
if (!can_optimize((unsigned long)op->kp.addr))
return -EILSEQ;
op->optinsn.insn = get_optinsn_slot();
if (!op->optinsn.insn)
buf = kzalloc(MAX_OPTINSN_SIZE, GFP_KERNEL);
if (!buf)
return -ENOMEM;
op->optinsn.insn = slot = get_optinsn_slot();
if (!slot) {
ret = -ENOMEM;
goto out;
}
/*
* Verify if the address gap is in 2GB range, because this uses
* a relative jump.
*/
rel = (long)op->optinsn.insn - (long)op->kp.addr + RELATIVEJUMP_SIZE;
rel = (long)slot - (long)op->kp.addr + RELATIVEJUMP_SIZE;
if (abs(rel) > 0x7fffffff) {
__arch_remove_optimized_kprobe(op, 0);
return -ERANGE;
ret = -ERANGE;
goto err;
}
buf = (u8 *)op->optinsn.insn;
set_memory_rw((unsigned long)buf & PAGE_MASK, 1);
/* Copy instructions into the out-of-line buffer */
ret = copy_optimized_instructions(buf + TMPL_END_IDX, op->kp.addr);
if (ret < 0) {
__arch_remove_optimized_kprobe(op, 0);
return ret;
}
op->optinsn.size = ret;
/* Copy arch-dep-instance from template */
memcpy(buf, &optprobe_template_entry, TMPL_END_IDX);
memcpy(buf, optprobe_template_entry, TMPL_END_IDX);
/* Copy instructions into the out-of-line buffer */
ret = copy_optimized_instructions(buf + TMPL_END_IDX, op->kp.addr,
slot + TMPL_END_IDX);
if (ret < 0)
goto err;
op->optinsn.size = ret;
len = TMPL_END_IDX + op->optinsn.size;
/* Set probe information */
synthesize_set_arg1(buf + TMPL_MOVE_IDX, (unsigned long)op);
/* Set probe function call */
synthesize_relcall(buf + TMPL_CALL_IDX, optimized_callback);
synthesize_relcall(buf + TMPL_CALL_IDX,
slot + TMPL_CALL_IDX, optimized_callback);
/* Set returning jmp instruction at the tail of out-of-line buffer */
synthesize_reljump(buf + TMPL_END_IDX + op->optinsn.size,
synthesize_reljump(buf + len, slot + len,
(u8 *)op->kp.addr + op->optinsn.size);
len += RELATIVEJUMP_SIZE;
set_memory_ro((unsigned long)buf & PAGE_MASK, 1);
/* We have to use text_poke for the instruction buffer because it is RO */
text_poke(slot, buf, len);
ret = 0;
out:
kfree(buf);
return ret;
flush_icache_range((unsigned long) buf,
(unsigned long) buf + TMPL_END_IDX +
op->optinsn.size + RELATIVEJUMP_SIZE);
return 0;
err:
__arch_remove_optimized_kprobe(op, 0);
goto out;
}
/*
+45 -109
@@ -56,122 +56,54 @@ static ssize_t direct_entry(struct file *f, const char __user *user_buf,
size_t count, loff_t *off);
#ifdef CONFIG_KPROBES
static void lkdtm_handler(void);
static int lkdtm_kprobe_handler(struct kprobe *kp, struct pt_regs *regs);
static ssize_t lkdtm_debugfs_entry(struct file *f,
const char __user *user_buf,
size_t count, loff_t *off);
/* jprobe entry point handlers. */
static unsigned int jp_do_irq(unsigned int irq)
{
lkdtm_handler();
jprobe_return();
return 0;
}
static irqreturn_t jp_handle_irq_event(unsigned int irq,
struct irqaction *action)
{
lkdtm_handler();
jprobe_return();
return 0;
}
static void jp_tasklet_action(struct softirq_action *a)
{
lkdtm_handler();
jprobe_return();
}
static void jp_ll_rw_block(int rw, int nr, struct buffer_head *bhs[])
{
lkdtm_handler();
jprobe_return();
}
struct scan_control;
static unsigned long jp_shrink_inactive_list(unsigned long max_scan,
struct zone *zone,
struct scan_control *sc)
{
lkdtm_handler();
jprobe_return();
return 0;
}
static int jp_hrtimer_start(struct hrtimer *timer, ktime_t tim,
const enum hrtimer_mode mode)
{
lkdtm_handler();
jprobe_return();
return 0;
}
static int jp_scsi_dispatch_cmd(struct scsi_cmnd *cmd)
{
lkdtm_handler();
jprobe_return();
return 0;
}
# ifdef CONFIG_IDE
static int jp_generic_ide_ioctl(ide_drive_t *drive, struct file *file,
struct block_device *bdev, unsigned int cmd,
unsigned long arg)
{
lkdtm_handler();
jprobe_return();
return 0;
}
# endif
# define CRASHPOINT_KPROBE(_symbol) \
.kprobe = { \
.symbol_name = (_symbol), \
.pre_handler = lkdtm_kprobe_handler, \
},
# define CRASHPOINT_WRITE(_symbol) \
(_symbol) ? lkdtm_debugfs_entry : direct_entry
#else
# define CRASHPOINT_KPROBE(_symbol)
# define CRASHPOINT_WRITE(_symbol) direct_entry
#endif
/* Crash points */
struct crashpoint {
const char *name;
const struct file_operations fops;
struct jprobe jprobe;
struct kprobe kprobe;
};
#define CRASHPOINT(_name, _write, _symbol, _entry) \
#define CRASHPOINT(_name, _symbol) \
{ \
.name = _name, \
.fops = { \
.read = lkdtm_debugfs_read, \
.llseek = generic_file_llseek, \
.open = lkdtm_debugfs_open, \
.write = _write, \
}, \
.jprobe = { \
.kp.symbol_name = _symbol, \
.entry = (kprobe_opcode_t *)_entry, \
.write = CRASHPOINT_WRITE(_symbol) \
}, \
CRASHPOINT_KPROBE(_symbol) \
}
/* Define the possible places where we can trigger a crash point. */
struct crashpoint crashpoints[] = {
CRASHPOINT("DIRECT", direct_entry,
NULL, NULL),
static struct crashpoint crashpoints[] = {
CRASHPOINT("DIRECT", NULL),
#ifdef CONFIG_KPROBES
CRASHPOINT("INT_HARDWARE_ENTRY", lkdtm_debugfs_entry,
"do_IRQ", jp_do_irq),
CRASHPOINT("INT_HW_IRQ_EN", lkdtm_debugfs_entry,
"handle_IRQ_event", jp_handle_irq_event),
CRASHPOINT("INT_TASKLET_ENTRY", lkdtm_debugfs_entry,
"tasklet_action", jp_tasklet_action),
CRASHPOINT("FS_DEVRW", lkdtm_debugfs_entry,
"ll_rw_block", jp_ll_rw_block),
CRASHPOINT("MEM_SWAPOUT", lkdtm_debugfs_entry,
"shrink_inactive_list", jp_shrink_inactive_list),
CRASHPOINT("TIMERADD", lkdtm_debugfs_entry,
"hrtimer_start", jp_hrtimer_start),
CRASHPOINT("SCSI_DISPATCH_CMD", lkdtm_debugfs_entry,
"scsi_dispatch_cmd", jp_scsi_dispatch_cmd),
CRASHPOINT("INT_HARDWARE_ENTRY", "do_IRQ"),
CRASHPOINT("INT_HW_IRQ_EN", "handle_IRQ_event"),
CRASHPOINT("INT_TASKLET_ENTRY", "tasklet_action"),
CRASHPOINT("FS_DEVRW", "ll_rw_block"),
CRASHPOINT("MEM_SWAPOUT", "shrink_inactive_list"),
CRASHPOINT("TIMERADD", "hrtimer_start"),
CRASHPOINT("SCSI_DISPATCH_CMD", "scsi_dispatch_cmd"),
# ifdef CONFIG_IDE
CRASHPOINT("IDE_CORE_CP", lkdtm_debugfs_entry,
"generic_ide_ioctl", jp_generic_ide_ioctl),
CRASHPOINT("IDE_CORE_CP", "generic_ide_ioctl"),
# endif
#endif
};
@@ -254,8 +186,8 @@ struct crashtype crashtypes[] = {
};
/* Global jprobe entry and crashtype. */
static struct jprobe *lkdtm_jprobe;
/* Global kprobe entry and crashtype. */
static struct kprobe *lkdtm_kprobe;
struct crashpoint *lkdtm_crashpoint;
struct crashtype *lkdtm_crashtype;
@@ -298,7 +230,8 @@ static struct crashtype *find_crashtype(const char *name)
*/
static noinline void lkdtm_do_action(struct crashtype *crashtype)
{
BUG_ON(!crashtype || !crashtype->func);
if (WARN_ON(!crashtype || !crashtype->func))
return;
crashtype->func();
}
@@ -308,22 +241,22 @@ static int lkdtm_register_cpoint(struct crashpoint *crashpoint,
int ret;
/* If this doesn't have a symbol, just call immediately. */
if (!crashpoint->jprobe.kp.symbol_name) {
if (!crashpoint->kprobe.symbol_name) {
lkdtm_do_action(crashtype);
return 0;
}
if (lkdtm_jprobe != NULL)
unregister_jprobe(lkdtm_jprobe);
if (lkdtm_kprobe != NULL)
unregister_kprobe(lkdtm_kprobe);
lkdtm_crashpoint = crashpoint;
lkdtm_crashtype = crashtype;
lkdtm_jprobe = &crashpoint->jprobe;
ret = register_jprobe(lkdtm_jprobe);
lkdtm_kprobe = &crashpoint->kprobe;
ret = register_kprobe(lkdtm_kprobe);
if (ret < 0) {
pr_info("Couldn't register jprobe %s\n",
crashpoint->jprobe.kp.symbol_name);
lkdtm_jprobe = NULL;
pr_info("Couldn't register kprobe %s\n",
crashpoint->kprobe.symbol_name);
lkdtm_kprobe = NULL;
lkdtm_crashpoint = NULL;
lkdtm_crashtype = NULL;
}
@@ -336,13 +269,14 @@ static int lkdtm_register_cpoint(struct crashpoint *crashpoint,
static int crash_count = DEFAULT_COUNT;
static DEFINE_SPINLOCK(crash_count_lock);
/* Called by jprobe entry points. */
static void lkdtm_handler(void)
/* Called by kprobe entry points. */
static int lkdtm_kprobe_handler(struct kprobe *kp, struct pt_regs *regs)
{
unsigned long flags;
bool do_it = false;
BUG_ON(!lkdtm_crashpoint || !lkdtm_crashtype);
if (WARN_ON(!lkdtm_crashpoint || !lkdtm_crashtype))
return 0;
spin_lock_irqsave(&crash_count_lock, flags);
crash_count--;
@@ -357,6 +291,8 @@ static void lkdtm_handler(void)
if (do_it)
lkdtm_do_action(lkdtm_crashtype);
return 0;
}
static ssize_t lkdtm_debugfs_entry(struct file *f,
@@ -556,8 +492,8 @@ static void __exit lkdtm_module_exit(void)
/* Handle test-specific clean-up. */
lkdtm_usercopy_exit();
if (lkdtm_jprobe != NULL)
unregister_jprobe(lkdtm_jprobe);
if (lkdtm_kprobe != NULL)
unregister_kprobe(lkdtm_kprobe);
pr_info("Crash point unregistered\n");
}
+16 -20
@@ -391,10 +391,6 @@ int register_kprobes(struct kprobe **kps, int num);
void unregister_kprobes(struct kprobe **kps, int num);
int setjmp_pre_handler(struct kprobe *, struct pt_regs *);
int longjmp_break_handler(struct kprobe *, struct pt_regs *);
int register_jprobe(struct jprobe *p);
void unregister_jprobe(struct jprobe *p);
int register_jprobes(struct jprobe **jps, int num);
void unregister_jprobes(struct jprobe **jps, int num);
void jprobe_return(void);
unsigned long arch_deref_entry_point(void *);
@@ -443,20 +439,6 @@ static inline void unregister_kprobe(struct kprobe *p)
static inline void unregister_kprobes(struct kprobe **kps, int num)
{
}
static inline int register_jprobe(struct jprobe *p)
{
return -ENOSYS;
}
static inline int register_jprobes(struct jprobe **jps, int num)
{
return -ENOSYS;
}
static inline void unregister_jprobe(struct jprobe *p)
{
}
static inline void unregister_jprobes(struct jprobe **jps, int num)
{
}
static inline void jprobe_return(void)
{
}
@@ -486,6 +468,20 @@ static inline int enable_kprobe(struct kprobe *kp)
return -ENOSYS;
}
#endif /* CONFIG_KPROBES */
static inline int register_jprobe(struct jprobe *p)
{
return -ENOSYS;
}
static inline int register_jprobes(struct jprobe **jps, int num)
{
return -ENOSYS;
}
static inline void unregister_jprobe(struct jprobe *p)
{
}
static inline void unregister_jprobes(struct jprobe **jps, int num)
{
}
static inline int disable_kretprobe(struct kretprobe *rp)
{
return disable_kprobe(&rp->kp);
@@ -496,11 +492,11 @@ static inline int enable_kretprobe(struct kretprobe *rp)
}
static inline int disable_jprobe(struct jprobe *jp)
{
return disable_kprobe(&jp->kp);
return -ENOSYS;
}
static inline int enable_jprobe(struct jprobe *jp)
{
return enable_kprobe(&jp->kp);
return -ENOSYS;
}
#ifndef CONFIG_KPROBES
+9 -23
@@ -485,9 +485,9 @@ struct perf_addr_filters_head {
};
/**
* enum perf_event_active_state - the states of a event
* enum perf_event_state - the states of an event
*/
enum perf_event_active_state {
enum perf_event_state {
PERF_EVENT_STATE_DEAD = -4,
PERF_EVENT_STATE_EXIT = -3,
PERF_EVENT_STATE_ERROR = -2,
@@ -578,7 +578,7 @@ struct perf_event {
struct pmu *pmu;
void *pmu_private;
enum perf_event_active_state state;
enum perf_event_state state;
unsigned int attach_state;
local64_t count;
atomic64_t child_count;
@@ -588,26 +588,10 @@ struct perf_event {
* has been enabled (i.e. eligible to run, and the task has
* been scheduled in, if this is a per-task event)
* and running (scheduled onto the CPU), respectively.
*
* They are computed from tstamp_enabled, tstamp_running and
* tstamp_stopped when the event is in INACTIVE or ACTIVE state.
*/
u64 total_time_enabled;
u64 total_time_running;
/*
* These are timestamps used for computing total_time_enabled
* and total_time_running when the event is in INACTIVE or
* ACTIVE state, measured in nanoseconds from an arbitrary point
* in time.
* tstamp_enabled: the notional time when the event was enabled
* tstamp_running: the notional time when the event was scheduled on
* tstamp_stopped: in INACTIVE state, the notional time when the
* event was scheduled off.
*/
u64 tstamp_enabled;
u64 tstamp_running;
u64 tstamp_stopped;
u64 tstamp;
/*
* timestamp shadows the actual context timing but it can
@@ -699,7 +683,6 @@ struct perf_event {
#ifdef CONFIG_CGROUP_PERF
struct perf_cgroup *cgrp; /* cgroup event is attach to */
int cgrp_defer_enabled;
#endif
struct list_head sb_list;
@@ -806,6 +789,7 @@ struct perf_output_handle {
struct bpf_perf_event_data_kern {
struct pt_regs *regs;
struct perf_sample_data *data;
struct perf_event *event;
};
#ifdef CONFIG_CGROUP_PERF
@@ -884,7 +868,8 @@ perf_event_create_kernel_counter(struct perf_event_attr *attr,
void *context);
extern void perf_pmu_migrate_context(struct pmu *pmu,
int src_cpu, int dst_cpu);
int perf_event_read_local(struct perf_event *event, u64 *value);
int perf_event_read_local(struct perf_event *event, u64 *value,
u64 *enabled, u64 *running);
extern u64 perf_event_read_value(struct perf_event *event,
u64 *enabled, u64 *running);
@@ -1286,7 +1271,8 @@ static inline const struct perf_event_attr *perf_event_attrs(struct perf_event *
{
return ERR_PTR(-EINVAL);
}
static inline int perf_event_read_local(struct perf_event *event, u64 *value)
static inline int perf_event_read_local(struct perf_event *event, u64 *value,
u64 *enabled, u64 *running)
{
return -EINVAL;
}
+1 -1
@@ -492,7 +492,7 @@ static void *perf_event_fd_array_get_ptr(struct bpf_map *map,
ee = ERR_PTR(-EOPNOTSUPP);
event = perf_file->private_data;
if (perf_event_read_local(event, &value) == -EOPNOTSUPP)
if (perf_event_read_local(event, &value, NULL, NULL) == -EOPNOTSUPP)
goto err_out;
ee = bpf_event_entry_gen(perf_file, map_file);
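The extra NULL arguments in these call sites follow the new
perf_event_read_local() prototype shown in the perf_event.h hunk above. A
hypothetical caller that does want the timing statistics would pass real
pointers and can apply the usual multiplexing scale-up, roughly::

    u64 value, enabled, running;
    int err;

    err = perf_event_read_local(event, &value, &enabled, &running);
    if (err)
            return err;

    /* scale for time the event was not scheduled in:
     * scaled = value * enabled / running  (when running != 0) */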
+199 -271
File diff suppressed because it is too large.
+11 -7
@@ -117,7 +117,7 @@ enum kprobe_slot_state {
SLOT_USED = 2,
};
static void *alloc_insn_page(void)
void __weak *alloc_insn_page(void)
{
return module_alloc(PAGE_SIZE);
}
@@ -573,13 +573,15 @@ static void kprobe_optimizer(struct work_struct *work)
do_unoptimize_kprobes();
/*
* Step 2: Wait for quiesence period to ensure all running interrupts
* are done. Because optprobe may modify multiple instructions
* there is a chance that Nth instruction is interrupted. In that
* case, running interrupt can return to 2nd-Nth byte of jump
* instruction. This wait is for avoiding it.
* Step 2: Wait for the quiescence period to ensure all potentially
* preempted tasks have been scheduled normally again. Because an optprobe
* may modify multiple instructions, there is a chance that the Nth
* instruction is preempted. In that case, such tasks can return
* to the 2nd-Nth byte of the jump instruction. This wait is for avoiding it.
* Note that on a non-preemptive kernel, this is transparently converted
* to synchronize_sched() to wait for all interrupts to have completed.
*/
synchronize_sched();
synchronize_rcu_tasks();
/* Step 3: Optimize kprobes after quiesence period */
do_optimize_kprobes();
@@ -1769,6 +1771,7 @@ unsigned long __weak arch_deref_entry_point(void *entry)
return (unsigned long)entry;
}
#if 0
int register_jprobes(struct jprobe **jps, int num)
{
int ret = 0, i;
@@ -1837,6 +1840,7 @@ void unregister_jprobes(struct jprobe **jps, int num)
}
}
EXPORT_SYMBOL_GPL(unregister_jprobes);
#endif
#ifdef CONFIG_KRETPROBES
/*
+28 -1
@@ -22,7 +22,7 @@
#define div_factor 3
static u32 rand1, preh_val, posth_val, jph_val;
static u32 rand1, preh_val, posth_val;
static int errors, handler_errors, num_tests;
static u32 (*target)(u32 value);
static u32 (*target2)(u32 value);
@@ -34,6 +34,10 @@ static noinline u32 kprobe_target(u32 value)
static int kp_pre_handler(struct kprobe *p, struct pt_regs *regs)
{
if (preemptible()) {
handler_errors++;
pr_err("pre-handler is preemptible\n");
}
preh_val = (rand1 / div_factor);
return 0;
}
@@ -41,6 +45,10 @@ static int kp_pre_handler(struct kprobe *p, struct pt_regs *regs)
static void kp_post_handler(struct kprobe *p, struct pt_regs *regs,
unsigned long flags)
{
if (preemptible()) {
handler_errors++;
pr_err("post-handler is preemptible\n");
}
if (preh_val != (rand1 / div_factor)) {
handler_errors++;
pr_err("incorrect value in post_handler\n");
@@ -154,8 +162,15 @@ static int test_kprobes(void)
}
#if 0
static u32 jph_val;
static u32 j_kprobe_target(u32 value)
{
if (preemptible()) {
handler_errors++;
pr_err("jprobe-handler is preemptible\n");
}
if (value != rand1) {
handler_errors++;
pr_err("incorrect value in jprobe handler\n");
@@ -227,11 +242,19 @@ static int test_jprobes(void)
return 0;
}
#else
#define test_jprobe() (0)
#define test_jprobes() (0)
#endif
#ifdef CONFIG_KRETPROBES
static u32 krph_val;
static int entry_handler(struct kretprobe_instance *ri, struct pt_regs *regs)
{
if (preemptible()) {
handler_errors++;
pr_err("kretprobe entry handler is preemptible\n");
}
krph_val = (rand1 / div_factor);
return 0;
}
@@ -240,6 +263,10 @@ static int return_handler(struct kretprobe_instance *ri, struct pt_regs *regs)
{
unsigned long ret = regs_return_value(regs);
if (preemptible()) {
handler_errors++;
pr_err("kretprobe return handler is preemptible\n");
}
if (ret != (rand1 / div_factor)) {
handler_errors++;
pr_err("incorrect value in kretprobe handler\n");
+1 -1
@@ -275,7 +275,7 @@ BPF_CALL_2(bpf_perf_event_read, struct bpf_map *, map, u64, flags)
if (!ee)
return -ENOENT;
err = perf_event_read_local(ee->event, &value);
err = perf_event_read_local(ee->event, &value, NULL, NULL);
/*
* this api is ugly since we miss [-22..-2] range of valid
* counter values, but that's uapi
+1 -1
@@ -1,5 +1,5 @@
# builds the kprobes example kernel modules;
# then to use one (as root): insmod <module_name.ko>
obj-$(CONFIG_SAMPLE_KPROBES) += kprobe_example.o jprobe_example.o
obj-$(CONFIG_SAMPLE_KPROBES) += kprobe_example.o
obj-$(CONFIG_SAMPLE_KRETPROBES) += kretprobe_example.o
-67
@@ -1,67 +0,0 @@
/*
* Here's a sample kernel module showing the use of jprobes to dump
* the arguments of _do_fork().
*
* For more information on theory of operation of jprobes, see
* Documentation/kprobes.txt
*
* Build and insert the kernel module as done in the kprobe example.
* You will see the trace data in /var/log/messages and on the
* console whenever _do_fork() is invoked to create a new process.
* (Some messages may be suppressed if syslogd is configured to
* eliminate duplicate messages.)
*/
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/kprobes.h>
/*
* Jumper probe for _do_fork.
* Mirror principle enables access to arguments of the probed routine
* from the probe handler.
*/
/* Proxy routine having the same arguments as actual _do_fork() routine */
static long j_do_fork(unsigned long clone_flags, unsigned long stack_start,
unsigned long stack_size, int __user *parent_tidptr,
int __user *child_tidptr, unsigned long tls)
{
pr_info("jprobe: clone_flags = 0x%lx, stack_start = 0x%lx "
"stack_size = 0x%lx\n", clone_flags, stack_start, stack_size);
/* Always end with a call to jprobe_return(). */
jprobe_return();
return 0;
}
static struct jprobe my_jprobe = {
.entry = j_do_fork,
.kp = {
.symbol_name = "_do_fork",
},
};
static int __init jprobe_init(void)
{
int ret;
ret = register_jprobe(&my_jprobe);
if (ret < 0) {
pr_err("register_jprobe failed, returned %d\n", ret);
return -1;
}
pr_info("Planted jprobe at %p, handler addr %p\n",
my_jprobe.kp.addr, my_jprobe.entry);
return 0;
}
static void __exit jprobe_exit(void)
{
unregister_jprobe(&my_jprobe);
pr_info("jprobe at %p unregistered\n", my_jprobe.kp.addr);
}
module_init(jprobe_init)
module_exit(jprobe_exit)
MODULE_LICENSE("GPL");

Some files were not shown because too many files have changed in this diff.