Mirror of https://github.com/linux-apfs/linux-apfs.git, synced 2026-05-01 15:00:59 -07:00
Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
"The main changes in this cycle were:
Kernel:
- kprobes updates: use better W^X patterns for code modifications,
improve optprobes, remove jprobes. (Masami Hiramatsu, Kees Cook)
- core fixes: event timekeeping (enabled/running times statistics)
fixes, perf_event_read() locking fixes and cleanups, etc. (Peter
Zijlstra)
- Extend x86 Intel free-running PEBS support and support x86
user-register sampling in perf record and perf script. (Andi Kleen)
Tooling:
- Completely rework the way inline frames are handled. Instead of
querying for the inline nodes on-demand in the individual tools, we
now create proper callchain nodes for inlined frames. (Milian
Wolff)
- 'perf trace' updates (Arnaldo Carvalho de Melo)
- Implement a way to print formatted output to per-event files in
     'perf script' to facilitate generating flamegraphs, eliminating the
need to write scripts to do that separation (yuzhoujian, Arnaldo
Carvalho de Melo)
- Update vendor events JSON metrics for Intel's Broadwell, Broadwell
Server, Haswell, Haswell Server, IvyBridge, IvyTown, JakeTown,
Sandy Bridge, Skylake, SkyLake Server - and Goldmont Plus V1 (Andi
Kleen, Kan Liang)
- Multithread the synthesizing of PERF_RECORD_ events for
pre-existing threads in 'perf top', speeding up that phase, greatly
improving the user experience in systems such as Intel's Knights
Mill (Kan Liang)
   - Introduce the concept of weak groups in 'perf stat': try to set up
     a group, but if it's not schedulable, fall back to not using a group.
That gives us the best of both worlds: groups if they work, but
still a usable fallback if they don't. E.g: (Andi Kleen)
- perf sched timehist enhancements (David Ahern)
- ... various other enhancements, updates, cleanups and fixes"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (139 commits)
kprobes: Don't spam the build log with deprecation warnings
arm/kprobes: Remove jprobe test case
arm/kprobes: Fix kretprobe test to check correct counter
perf srcline: Show correct function name for srcline of callchains
perf srcline: Fix memory leak in addr2inlines()
perf trace beauty kcmp: Beautify arguments
perf trace beauty: Implement pid_fd beautifier
tools include uapi: Grab a copy of linux/kcmp.h
perf callchain: Fix double mapping al->addr for children without self period
perf stat: Make --per-thread update shadow stats to show metrics
perf stat: Move the shadow stats scale computation in perf_stat__update_shadow_stats
perf tools: Add perf_data_file__write function
perf tools: Add struct perf_data_file
perf tools: Rename struct perf_data_file to perf_data
perf script: Print information about per-event-dump files
perf trace beauty prctl: Generate 'option' string table from kernel headers
tools include uapi: Grab a copy of linux/prctl.h
perf script: Allow creating per-event dump files
perf evsel: Restore evsel->priv as a tool private area
perf script: Use event_format__fprintf()
...
+57 -102
@@ -8,7 +8,7 @@ Kernel Probes (Kprobes)

 .. CONTENTS

-  1. Concepts: Kprobes, Jprobes, Return Probes
+  1. Concepts: Kprobes, and Return Probes
   2. Architectures Supported
   3. Configuring Kprobes
   4. API Reference
@@ -16,12 +16,12 @@ Kernel Probes (Kprobes)
   6. Probe Overhead
   7. TODO
   8. Kprobes Example
-  9. Jprobes Example
-  10. Kretprobes Example
+  9. Kretprobes Example
+  10. Deprecated Features
   Appendix A: The kprobes debugfs interface
   Appendix B: The kprobes sysctl interface

-Concepts: Kprobes, Jprobes, Return Probes
-=========================================
+Concepts: Kprobes and Return Probes
+===================================

 Kprobes enables you to dynamically break into any kernel routine and
@@ -32,12 +32,10 @@ routine to be invoked when the breakpoint is hit.
 .. [1] some parts of the kernel code can not be trapped, see
        :ref:`kprobes_blacklist`)

-There are currently three types of probes: kprobes, jprobes, and
-kretprobes (also called return probes).  A kprobe can be inserted
-on virtually any instruction in the kernel.  A jprobe is inserted at
-the entry to a kernel function, and provides convenient access to the
-function's arguments.  A return probe fires when a specified function
-returns.
+There are currently two types of probes: kprobes, and kretprobes
+(also called return probes).  A kprobe can be inserted on virtually
+any instruction in the kernel.  A return probe fires when a specified
+function returns.

 In the typical case, Kprobes-based instrumentation is packaged as
 a kernel module.  The module's init function installs ("registers")
@@ -82,45 +80,6 @@ After the instruction is single-stepped, Kprobes executes the
 "post_handler," if any, that is associated with the kprobe.
 Execution then continues with the instruction following the probepoint.

-How Does a Jprobe Work?
------------------------
-
-A jprobe is implemented using a kprobe that is placed on a function's
-entry point.  It employs a simple mirroring principle to allow
-seamless access to the probed function's arguments.  The jprobe
-handler routine should have the same signature (arg list and return
-type) as the function being probed, and must always end by calling
-the Kprobes function jprobe_return().
-
-Here's how it works.  When the probe is hit, Kprobes makes a copy of
-the saved registers and a generous portion of the stack (see below).
-Kprobes then points the saved instruction pointer at the jprobe's
-handler routine, and returns from the trap.  As a result, control
-passes to the handler, which is presented with the same register and
-stack contents as the probed function.  When it is done, the handler
-calls jprobe_return(), which traps again to restore the original stack
-contents and processor state and switch to the probed function.
-
-By convention, the callee owns its arguments, so gcc may produce code
-that unexpectedly modifies that portion of the stack.  This is why
-Kprobes saves a copy of the stack and restores it after the jprobe
-handler has run.  Up to MAX_STACK_SIZE bytes are copied -- e.g.,
-64 bytes on i386.
-
-Note that the probed function's args may be passed on the stack
-or in registers.  The jprobe will work in either case, so long as the
-handler's prototype matches that of the probed function.
-
-Note that in some architectures (e.g.: arm64 and sparc64) the stack
-copy is not done, as the actual location of stacked parameters may be
-outside of a reasonable MAX_STACK_SIZE value and because that location
-cannot be determined by the jprobes code.  In this case the jprobes
-user must be careful to make certain the calling signature of the
-function does not cause parameters to be passed on the stack (e.g.:
-more than eight function arguments, an argument of more than sixteen
-bytes, or more than 64 bytes of argument data, depending on
-architecture).
-
 Return Probes
 -------------
@@ -245,8 +204,7 @@ Pre-optimization
 After preparing the detour buffer, Kprobes verifies that none of the
 following situations exist:

-- The probe has either a break_handler (i.e., it's a jprobe) or a
-  post_handler.
+- The probe has a post_handler.
 - Other instructions in the optimized region are probed.
 - The probe is disabled.

@@ -331,7 +289,7 @@ rejects registering it, if the given address is in the blacklist.
 Architectures Supported
 =======================

-Kprobes, jprobes, and return probes are implemented on the following
+Kprobes and return probes are implemented on the following
 architectures:

 - i386 (Supports jump optimization)
@@ -446,27 +404,6 @@ architecture-specific trap number associated with the fault (e.g.,
 on i386, 13 for a general protection fault or 14 for a page fault).
 Returns 1 if it successfully handled the exception.

-register_jprobe
----------------
-
-::
-
-	#include <linux/kprobes.h>
-	int register_jprobe(struct jprobe *jp)
-
-Sets a breakpoint at the address jp->kp.addr, which must be the address
-of the first instruction of a function.  When the breakpoint is hit,
-Kprobes runs the handler whose address is jp->entry.
-
-The handler should have the same arg list and return type as the probed
-function; and just before it returns, it must call jprobe_return().
-(The handler never actually returns, since jprobe_return() returns
-control to Kprobes.)  If the probed function is declared asmlinkage
-or anything else that affects how args are passed, the handler's
-declaration must match.
-
-register_jprobe() returns 0 on success, or a negative errno otherwise.
-
 register_kretprobe
 ------------------
@@ -513,7 +450,6 @@ unregister_*probe

	#include <linux/kprobes.h>
	void unregister_kprobe(struct kprobe *kp);
-	void unregister_jprobe(struct jprobe *jp);
	void unregister_kretprobe(struct kretprobe *rp);

 Removes the specified probe.  The unregister function can be called
@@ -532,7 +468,6 @@ register_*probes

	#include <linux/kprobes.h>
	int register_kprobes(struct kprobe **kps, int num);
	int register_kretprobes(struct kretprobe **rps, int num);
-	int register_jprobes(struct jprobe **jps, int num);

 Registers each of the num probes in the specified array.  If any
 error occurs during registration, all probes in the array, up to
@@ -555,7 +490,6 @@ unregister_*probes

	#include <linux/kprobes.h>
	void unregister_kprobes(struct kprobe **kps, int num);
	void unregister_kretprobes(struct kretprobe **rps, int num);
-	void unregister_jprobes(struct jprobe **jps, int num);

 Removes each of the num probes in the specified array at once.
@@ -574,7 +508,6 @@ disable_*probe

	#include <linux/kprobes.h>
	int disable_kprobe(struct kprobe *kp);
	int disable_kretprobe(struct kretprobe *rp);
-	int disable_jprobe(struct jprobe *jp);

 Temporarily disables the specified ``*probe``. You can enable it again by using
 enable_*probe(). You must specify the probe which has been registered.
@@ -587,7 +520,6 @@ enable_*probe

	#include <linux/kprobes.h>
	int enable_kprobe(struct kprobe *kp);
	int enable_kretprobe(struct kretprobe *rp);
-	int enable_jprobe(struct jprobe *jp);

 Enables ``*probe`` which has been disabled by disable_*probe(). You must specify
 the probe which has been registered.
@@ -595,12 +527,10 @@ the probe which has been registered.
 Kprobes Features and Limitations
 ================================

-Kprobes allows multiple probes at the same address. Currently,
-however, there cannot be multiple jprobes on the same function at
-the same time. Also, a probepoint for which there is a jprobe or
-a post_handler cannot be optimized. So if you install a jprobe,
-or a kprobe with a post_handler, at an optimized probepoint, the
-probepoint will be unoptimized automatically.
+Kprobes allows multiple probes at the same address. Also,
+a probepoint for which there is a post_handler cannot be optimized.
+So if you install a kprobe with a post_handler, at an optimized
+probepoint, the probepoint will be unoptimized automatically.

 In general, you can install a probe anywhere in the kernel.
 In particular, you can probe interrupt handlers.  Known exceptions
@@ -662,7 +592,7 @@ We're unaware of other specific cases where this could be a problem.
 If, upon entry to or exit from a function, the CPU is running on
 a stack other than that of the current task, registering a return
 probe on that function may produce undesirable results. For this
-reason, Kprobes doesn't support return probes (or kprobes or jprobes)
+reason, Kprobes doesn't support return probes (or kprobes)
 on the x86_64 version of __switch_to(); the registration functions
 return -EINVAL.
@@ -706,24 +636,24 @@ Probe Overhead
 On a typical CPU in use in 2005, a kprobe hit takes 0.5 to 1.0
 microseconds to process.  Specifically, a benchmark that hits the same
 probepoint repeatedly, firing a simple handler each time, reports 1-2
-million hits per second, depending on the architecture.  A jprobe or
-return-probe hit typically takes 50-75% longer than a kprobe hit.
+million hits per second, depending on the architecture.  A return-probe
+hit typically takes 50-75% longer than a kprobe hit.
 When you have a return probe set on a function, adding a kprobe at
 the entry to that function adds essentially no overhead.

 Here are sample overhead figures (in usec) for different architectures::

-  k = kprobe; j = jprobe; r = return probe; kr = kprobe + return probe
-  on same function; jr = jprobe + return probe on same function::
+  k = kprobe; r = return probe; kr = kprobe + return probe
+  on same function

   i386: Intel Pentium M, 1495 MHz, 2957.31 bogomips
-  k = 0.57 usec; j = 1.00; r = 0.92; kr = 0.99; jr = 1.40
+  k = 0.57 usec; r = 0.92; kr = 0.99

   x86_64: AMD Opteron 246, 1994 MHz, 3971.48 bogomips
-  k = 0.49 usec; j = 0.76; r = 0.80; kr = 0.82; jr = 1.07
+  k = 0.49 usec; r = 0.80; kr = 0.82

   ppc64: POWER5 (gr), 1656 MHz (SMT disabled, 1 virtual CPU per physical CPU)
-  k = 0.77 usec; j = 1.31; r = 1.26; kr = 1.45; jr = 1.99
+  k = 0.77 usec; r = 1.26; kr = 1.45

 Optimized Probe Overhead
 ------------------------
@@ -755,11 +685,6 @@ Kprobes Example

 See samples/kprobes/kprobe_example.c

-Jprobes Example
-===============
-
-See samples/kprobes/jprobe_example.c
-
 Kretprobes Example
 ==================

@@ -772,6 +697,37 @@ For additional information on Kprobes, refer to the following URLs:
 - http://www-users.cs.umn.edu/~boutcher/kprobes/
 - http://www.linuxsymposium.org/2006/linuxsymposium_procv2.pdf (pages 101-115)

+Deprecated Features
+===================
+
+Jprobes is now a deprecated feature. People who are depending on it should
+migrate to other tracing features or use older kernels. Please consider to
+migrate your tool to one of the following options:
+
+- Use trace-event to trace target function with arguments.
+
+  trace-event is a low-overhead (and almost no visible overhead if it
+  is off) statically defined event interface. You can define new events
+  and trace it via ftrace or any other tracing tools.
+
+  See the following urls:
+
+    - https://lwn.net/Articles/379903/
+    - https://lwn.net/Articles/381064/
+    - https://lwn.net/Articles/383362/
+
+- Use ftrace dynamic events (kprobe event) with perf-probe.
+
+  If you build your kernel with debug info (CONFIG_DEBUG_INFO=y), you can
+  find which register/stack is assigned to which local variable or arguments
+  by using perf-probe and set up new event to trace it.
+
+  See following documents:
+
+  - Documentation/trace/kprobetrace.txt
+  - Documentation/trace/events.txt
+  - tools/perf/Documentation/perf-probe.txt
+
+
 The kprobes debugfs interface
 =============================
@@ -783,14 +739,13 @@ under the /sys/kernel/debug/kprobes/ directory (assuming debugfs is mounted at /
 /sys/kernel/debug/kprobes/list: Lists all registered probes on the system::

	c015d71a  k  vfs_read+0x0
-	c011a316  j  do_fork+0x0
	c03dedc5  r  tcp_v4_rcv+0x0

 The first column provides the kernel address where the probe is inserted.
-The second column identifies the type of probe (k - kprobe, r - kretprobe
-and j - jprobe), while the third column specifies the symbol+offset of
-the probe. If the probed function belongs to a module, the module name
-is also specified. Following columns show probe status. If the probe is on
+The second column identifies the type of probe (k - kprobe and r - kretprobe)
+while the third column specifies the symbol+offset of the probe.
+If the probed function belongs to a module, the module name is also
+specified. Following columns show probe status. If the probe is on
 a virtual address that is no longer valid (module init sections, module
 virtual addresses that correspond to modules that've been unloaded),
 such probes are marked with [GONE]. If the probe is temporarily disabled,
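The two-column list format shown in this hunk is easy to post-process with standard text tools. A small sketch that counts the registered kprobes (type `k`) in a captured copy of the list, using the sample addresses quoted above:

```shell
printf 'c015d71a  k  vfs_read+0x0\nc03dedc5  r  tcp_v4_rcv+0x0\n' |
awk '$2 == "k" { n++ } END { print n }'
```

On the sample input this prints `1`; on a live system you would read /sys/kernel/debug/kprobes/list instead of the printf.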
+1 -1
@@ -91,7 +91,7 @@ config STATIC_KEYS_SELFTEST
 config OPTPROBES
	def_bool y
	depends on KPROBES && HAVE_OPTPROBES
-	depends on !PREEMPT
+	select TASKS_RCU if PREEMPT

 config KPROBES_ON_FTRACE
	def_bool y
@@ -227,7 +227,6 @@ static bool test_regs_ok;
 static int test_func_instance;
 static int pre_handler_called;
 static int post_handler_called;
-static int jprobe_func_called;
 static int kretprobe_handler_called;
 static int tests_failed;

@@ -370,50 +369,6 @@ static int test_kprobe(long (*func)(long, long))
	return 0;
 }

-static void __kprobes jprobe_func(long r0, long r1)
-{
-	jprobe_func_called = test_func_instance;
-	if (r0 == FUNC_ARG1 && r1 == FUNC_ARG2)
-		test_regs_ok = true;
-	jprobe_return();
-}
-
-static struct jprobe the_jprobe = {
-	.entry = jprobe_func,
-};
-
-static int test_jprobe(long (*func)(long, long))
-{
-	int ret;
-
-	the_jprobe.kp.addr = (kprobe_opcode_t *)func;
-	ret = register_jprobe(&the_jprobe);
-	if (ret < 0) {
-		pr_err("FAIL: register_jprobe failed with %d\n", ret);
-		return ret;
-	}
-
-	ret = call_test_func(func, true);
-
-	unregister_jprobe(&the_jprobe);
-	the_jprobe.kp.flags = 0; /* Clear disable flag to allow reuse */
-
-	if (!ret)
-		return -EINVAL;
-	if (jprobe_func_called != test_func_instance) {
-		pr_err("FAIL: jprobe handler function not called\n");
-		return -EINVAL;
-	}
-	if (!call_test_func(func, false))
-		return -EINVAL;
-	if (jprobe_func_called == test_func_instance) {
-		pr_err("FAIL: probe called after unregistering\n");
-		return -EINVAL;
-	}
-
-	return 0;
-}
-
 static int __kprobes
 kretprobe_handler(struct kretprobe_instance *ri, struct pt_regs *regs)
 {
@@ -451,7 +406,7 @@ static int test_kretprobe(long (*func)(long, long))
	}
	if (!call_test_func(func, false))
		return -EINVAL;
-	if (jprobe_func_called == test_func_instance) {
+	if (kretprobe_handler_called == test_func_instance) {
		pr_err("FAIL: kretprobe called after unregistering\n");
		return -EINVAL;
	}
@@ -468,18 +423,6 @@ static int run_api_tests(long (*func)(long, long))
	if (ret < 0)
		return ret;

-	pr_info("    jprobe\n");
-	ret = test_jprobe(func);
-#if defined(CONFIG_THUMB2_KERNEL) && !defined(MODULE)
-	if (ret == -EINVAL) {
-		pr_err("FAIL: Known longtime bug with jprobe on Thumb kernels\n");
-		tests_failed = ret;
-		ret = 0;
-	}
-#endif
-	if (ret < 0)
-		return ret;
-
	pr_info("    kretprobe\n");
	ret = test_kretprobe(func);
	if (ret < 0)
@@ -2958,6 +2958,10 @@ static unsigned long intel_pmu_free_running_flags(struct perf_event *event)

	if (event->attr.use_clockid)
		flags &= ~PERF_SAMPLE_TIME;
+	if (!event->attr.exclude_kernel)
+		flags &= ~PERF_SAMPLE_REGS_USER;
+	if (event->attr.sample_regs_user & ~PEBS_REGS)
+		flags &= ~(PERF_SAMPLE_REGS_USER | PERF_SAMPLE_REGS_INTR);
	return flags;
 }
@@ -85,13 +85,15 @@ struct amd_nb {
 * Flags PEBS can handle without an PMI.
 *
 * TID can only be handled by flushing at context switch.
+ * REGS_USER can be handled for events limited to ring 3.
 *
 */
 #define PEBS_FREERUNNING_FLAGS \
	(PERF_SAMPLE_IP | PERF_SAMPLE_TID | PERF_SAMPLE_ADDR | \
	PERF_SAMPLE_ID | PERF_SAMPLE_CPU | PERF_SAMPLE_STREAM_ID | \
	PERF_SAMPLE_DATA_SRC | PERF_SAMPLE_IDENTIFIER | \
-	PERF_SAMPLE_TRANSACTION | PERF_SAMPLE_PHYS_ADDR)
+	PERF_SAMPLE_TRANSACTION | PERF_SAMPLE_PHYS_ADDR | \
+	PERF_SAMPLE_REGS_INTR | PERF_SAMPLE_REGS_USER)

 /*
 * A debug store configuration.
@@ -110,6 +112,26 @@ struct debug_store {
	u64	pebs_event_reset[MAX_PEBS_EVENTS];
 };

+#define PEBS_REGS \
+	(PERF_REG_X86_AX | \
+	 PERF_REG_X86_BX | \
+	 PERF_REG_X86_CX | \
+	 PERF_REG_X86_DX | \
+	 PERF_REG_X86_DI | \
+	 PERF_REG_X86_SI | \
+	 PERF_REG_X86_SP | \
+	 PERF_REG_X86_BP | \
+	 PERF_REG_X86_IP | \
+	 PERF_REG_X86_FLAGS | \
+	 PERF_REG_X86_R8 | \
+	 PERF_REG_X86_R9 | \
+	 PERF_REG_X86_R10 | \
+	 PERF_REG_X86_R11 | \
+	 PERF_REG_X86_R12 | \
+	 PERF_REG_X86_R13 | \
+	 PERF_REG_X86_R14 | \
+	 PERF_REG_X86_R15)
+
 /*
 * Per register state.
 */
@@ -58,8 +58,8 @@ extern __visible kprobe_opcode_t optprobe_template_call[];
 extern __visible kprobe_opcode_t optprobe_template_end[];
 #define MAX_OPTIMIZED_LENGTH (MAX_INSN_SIZE + RELATIVE_ADDR_SIZE)
 #define MAX_OPTINSN_SIZE \
-	(((unsigned long)&optprobe_template_end - \
-	  (unsigned long)&optprobe_template_entry) + \
+	(((unsigned long)optprobe_template_end - \
+	  (unsigned long)optprobe_template_entry) + \
	 MAX_OPTIMIZED_LENGTH + RELATIVEJUMP_SIZE)

 extern const int kretprobe_blacklist_size;
@@ -85,11 +85,11 @@ extern unsigned long recover_probed_instruction(kprobe_opcode_t *buf,
 * Copy an instruction and adjust the displacement if the instruction
 * uses the %rip-relative addressing mode.
 */
-extern int __copy_instruction(u8 *dest, u8 *src, struct insn *insn);
+extern int __copy_instruction(u8 *dest, u8 *src, u8 *real, struct insn *insn);

 /* Generate a relative-jump/call instruction */
-extern void synthesize_reljump(void *from, void *to);
-extern void synthesize_relcall(void *from, void *to);
+extern void synthesize_reljump(void *dest, void *from, void *to);
+extern void synthesize_relcall(void *dest, void *from, void *to);

 #ifdef CONFIG_OPTPROBES
 extern int setup_detour_execution(struct kprobe *p, struct pt_regs *regs, int reenter);
@@ -119,29 +119,29 @@ struct kretprobe_blackpoint kretprobe_blacklist[] = {
 const int kretprobe_blacklist_size = ARRAY_SIZE(kretprobe_blacklist);

 static nokprobe_inline void
-__synthesize_relative_insn(void *from, void *to, u8 op)
+__synthesize_relative_insn(void *dest, void *from, void *to, u8 op)
 {
	struct __arch_relative_insn {
		u8 op;
		s32 raddr;
	} __packed *insn;

-	insn = (struct __arch_relative_insn *)from;
+	insn = (struct __arch_relative_insn *)dest;
	insn->raddr = (s32)((long)(to) - ((long)(from) + 5));
	insn->op = op;
 }

 /* Insert a jump instruction at address 'from', which jumps to address 'to'.*/
-void synthesize_reljump(void *from, void *to)
+void synthesize_reljump(void *dest, void *from, void *to)
 {
-	__synthesize_relative_insn(from, to, RELATIVEJUMP_OPCODE);
+	__synthesize_relative_insn(dest, from, to, RELATIVEJUMP_OPCODE);
 }
 NOKPROBE_SYMBOL(synthesize_reljump);

 /* Insert a call instruction at address 'from', which calls address 'to'.*/
-void synthesize_relcall(void *from, void *to)
+void synthesize_relcall(void *dest, void *from, void *to)
 {
-	__synthesize_relative_insn(from, to, RELATIVECALL_OPCODE);
+	__synthesize_relative_insn(dest, from, to, RELATIVECALL_OPCODE);
 }
 NOKPROBE_SYMBOL(synthesize_relcall);
@@ -346,10 +346,11 @@ static int is_IF_modifier(kprobe_opcode_t *insn)
 /*
 * Copy an instruction with recovering modified instruction by kprobes
 * and adjust the displacement if the instruction uses the %rip-relative
- * addressing mode.
+ * addressing mode. Note that since @real will be the final place of copied
+ * instruction, displacement must be adjust by @real, not @dest.
 * This returns the length of copied instruction, or 0 if it has an error.
 */
-int __copy_instruction(u8 *dest, u8 *src, struct insn *insn)
+int __copy_instruction(u8 *dest, u8 *src, u8 *real, struct insn *insn)
 {
	kprobe_opcode_t buf[MAX_INSN_SIZE];
	unsigned long recovered_insn =
@@ -387,11 +388,11 @@ int __copy_instruction(u8 *dest, u8 *src, struct insn *insn)
		 * have given.
		 */
		newdisp = (u8 *) src + (s64) insn->displacement.value
-			  - (u8 *) dest;
+			  - (u8 *) real;
		if ((s64) (s32) newdisp != newdisp) {
			pr_err("Kprobes error: new displacement does not fit into s32 (%llx)\n", newdisp);
			pr_err("\tSrc: %p, Dest: %p, old disp: %x\n",
-				src, dest, insn->displacement.value);
+				src, real, insn->displacement.value);
			return 0;
		}
		disp = (u8 *) dest + insn_offset_displacement(insn);
@@ -402,20 +403,38 @@ int __copy_instruction(u8 *dest, u8 *src, struct insn *insn)
 }

 /* Prepare reljump right after instruction to boost */
-static void prepare_boost(struct kprobe *p, struct insn *insn)
+static int prepare_boost(kprobe_opcode_t *buf, struct kprobe *p,
+			 struct insn *insn)
 {
+	int len = insn->length;
+
	if (can_boost(insn, p->addr) &&
-	    MAX_INSN_SIZE - insn->length >= RELATIVEJUMP_SIZE) {
+	    MAX_INSN_SIZE - len >= RELATIVEJUMP_SIZE) {
		/*
		 * These instructions can be executed directly if it
		 * jumps back to correct address.
		 */
-		synthesize_reljump(p->ainsn.insn + insn->length,
+		synthesize_reljump(buf + len, p->ainsn.insn + len,
				   p->addr + insn->length);
+		len += RELATIVEJUMP_SIZE;
		p->ainsn.boostable = true;
	} else {
		p->ainsn.boostable = false;
	}
+
+	return len;
 }

+/* Make page to RO mode when allocate it */
+void *alloc_insn_page(void)
+{
+	void *page;
+
+	page = module_alloc(PAGE_SIZE);
+	if (page)
+		set_memory_ro((unsigned long)page & PAGE_MASK, 1);
+
+	return page;
+}
+
 /* Recover page to RW mode before releasing it */
@@ -429,12 +448,11 @@ void free_insn_page(void *page)
 static int arch_copy_kprobe(struct kprobe *p)
 {
	struct insn insn;
+	kprobe_opcode_t buf[MAX_INSN_SIZE];
	int len;

-	set_memory_rw((unsigned long)p->ainsn.insn & PAGE_MASK, 1);
-
	/* Copy an instruction with recovering if other optprobe modifies it.*/
-	len = __copy_instruction(p->ainsn.insn, p->addr, &insn);
+	len = __copy_instruction(buf, p->addr, p->ainsn.insn, &insn);
	if (!len)
		return -EINVAL;
@@ -442,15 +460,16 @@ static int arch_copy_kprobe(struct kprobe *p)
	 * __copy_instruction can modify the displacement of the instruction,
	 * but it doesn't affect boostable check.
	 */
-	prepare_boost(p, &insn);
-
-	set_memory_ro((unsigned long)p->ainsn.insn & PAGE_MASK, 1);
+	len = prepare_boost(buf, p, &insn);

	/* Check whether the instruction modifies Interrupt Flag or not */
-	p->ainsn.if_modifier = is_IF_modifier(p->ainsn.insn);
+	p->ainsn.if_modifier = is_IF_modifier(buf);

	/* Also, displacement change doesn't affect the first byte */
-	p->opcode = p->ainsn.insn[0];
+	p->opcode = buf[0];

+	/* OK, write back the instruction(s) into ROX insn buffer */
+	text_poke(p->ainsn.insn, buf, len);
+
	return 0;
 }
@@ -26,7 +26,7 @@
 #include "common.h"

 static nokprobe_inline
-int __skip_singlestep(struct kprobe *p, struct pt_regs *regs,
+void __skip_singlestep(struct kprobe *p, struct pt_regs *regs,
		       struct kprobe_ctlblk *kcb, unsigned long orig_ip)
 {
	/*
@@ -41,33 +41,31 @@ int __skip_singlestep(struct kprobe *p, struct pt_regs *regs,
	__this_cpu_write(current_kprobe, NULL);
	if (orig_ip)
		regs->ip = orig_ip;
-	return 1;
 }

 int skip_singlestep(struct kprobe *p, struct pt_regs *regs,
		    struct kprobe_ctlblk *kcb)
 {
-	if (kprobe_ftrace(p))
-		return __skip_singlestep(p, regs, kcb, 0);
-	else
-		return 0;
+	if (kprobe_ftrace(p)) {
+		__skip_singlestep(p, regs, kcb, 0);
+		preempt_enable_no_resched();
+		return 1;
+	}
+	return 0;
 }
 NOKPROBE_SYMBOL(skip_singlestep);

-/* Ftrace callback handler for kprobes */
+/* Ftrace callback handler for kprobes -- called under preepmt disabed */
 void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
			   struct ftrace_ops *ops, struct pt_regs *regs)
 {
	struct kprobe *p;
	struct kprobe_ctlblk *kcb;
-	unsigned long flags;
-
-	/* Disable irq for emulating a breakpoint and avoiding preempt */
-	local_irq_save(flags);
-
+	/* Preempt is disabled by ftrace */
	p = get_kprobe((kprobe_opcode_t *)ip);
	if (unlikely(!p) || kprobe_disabled(p))
-		goto end;
+		return;

	kcb = get_kprobe_ctlblk();
	if (kprobe_running()) {
@@ -77,17 +75,19 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
		/* Kprobe handler expects regs->ip = ip + 1 as breakpoint hit */
		regs->ip = ip + sizeof(kprobe_opcode_t);

+		/* To emulate trap based kprobes, preempt_disable here */
+		preempt_disable();
		__this_cpu_write(current_kprobe, p);
		kcb->kprobe_status = KPROBE_HIT_ACTIVE;
-		if (!p->pre_handler || !p->pre_handler(p, regs))
+		if (!p->pre_handler || !p->pre_handler(p, regs)) {
			__skip_singlestep(p, regs, kcb, orig_ip);
+			preempt_enable_no_resched();
+		}
		/*
		 * If pre_handler returns !0, it sets regs->ip and
-		 * resets current kprobe.
+		 * resets current kprobe, and keep preempt count +1.
		 */
	}
-end:
-	local_irq_restore(flags);
 }
 NOKPROBE_SYMBOL(kprobe_ftrace_handler);
@@ -142,11 +142,11 @@ void optprobe_template_func(void);
 STACK_FRAME_NON_STANDARD(optprobe_template_func);

 #define TMPL_MOVE_IDX \
-	((long)&optprobe_template_val - (long)&optprobe_template_entry)
+	((long)optprobe_template_val - (long)optprobe_template_entry)
 #define TMPL_CALL_IDX \
-	((long)&optprobe_template_call - (long)&optprobe_template_entry)
+	((long)optprobe_template_call - (long)optprobe_template_entry)
 #define TMPL_END_IDX \
-	((long)&optprobe_template_end - (long)&optprobe_template_entry)
+	((long)optprobe_template_end - (long)optprobe_template_entry)

 #define INT3_SIZE sizeof(kprobe_opcode_t)
@@ -154,17 +154,15 @@ STACK_FRAME_NON_STANDARD(optprobe_template_func);
 static void
 optimized_callback(struct optimized_kprobe *op, struct pt_regs *regs)
 {
-	struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
-	unsigned long flags;
-
 	/* This is possible if op is under delayed unoptimizing */
 	if (kprobe_disabled(&op->kp))
 		return;
 
-	local_irq_save(flags);
+	preempt_disable();
 	if (kprobe_running()) {
 		kprobes_inc_nmissed_count(&op->kp);
 	} else {
+		struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
 		/* Save skipped registers */
 #ifdef CONFIG_X86_64
 		regs->cs = __KERNEL_CS;
@@ -180,17 +178,17 @@ optimized_callback(struct optimized_kprobe *op, struct pt_regs *regs)
 		opt_pre_handler(&op->kp, regs);
 		__this_cpu_write(current_kprobe, NULL);
 	}
-	local_irq_restore(flags);
+	preempt_enable_no_resched();
 }
 NOKPROBE_SYMBOL(optimized_callback);
 
-static int copy_optimized_instructions(u8 *dest, u8 *src)
+static int copy_optimized_instructions(u8 *dest, u8 *src, u8 *real)
 {
 	struct insn insn;
 	int len = 0, ret;
 
 	while (len < RELATIVEJUMP_SIZE) {
-		ret = __copy_instruction(dest + len, src + len, &insn);
+		ret = __copy_instruction(dest + len, src + len, real, &insn);
 		if (!ret || !can_boost(&insn, src + len))
 			return -EINVAL;
 		len += ret;
@@ -343,57 +341,66 @@ void arch_remove_optimized_kprobe(struct optimized_kprobe *op)
 int arch_prepare_optimized_kprobe(struct optimized_kprobe *op,
 				  struct kprobe *__unused)
 {
-	u8 *buf;
-	int ret;
+	u8 *buf = NULL, *slot;
+	int ret, len;
 	long rel;
 
 	if (!can_optimize((unsigned long)op->kp.addr))
 		return -EILSEQ;
 
-	op->optinsn.insn = get_optinsn_slot();
-	if (!op->optinsn.insn)
+	buf = kzalloc(MAX_OPTINSN_SIZE, GFP_KERNEL);
+	if (!buf)
 		return -ENOMEM;
 
+	op->optinsn.insn = slot = get_optinsn_slot();
+	if (!slot) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
 	/*
 	 * Verify if the address gap is in 2GB range, because this uses
 	 * a relative jump.
 	 */
-	rel = (long)op->optinsn.insn - (long)op->kp.addr + RELATIVEJUMP_SIZE;
+	rel = (long)slot - (long)op->kp.addr + RELATIVEJUMP_SIZE;
 	if (abs(rel) > 0x7fffffff) {
-		__arch_remove_optimized_kprobe(op, 0);
-		return -ERANGE;
+		ret = -ERANGE;
+		goto err;
 	}
 
-	buf = (u8 *)op->optinsn.insn;
-	set_memory_rw((unsigned long)buf & PAGE_MASK, 1);
-
-	/* Copy instructions into the out-of-line buffer */
-	ret = copy_optimized_instructions(buf + TMPL_END_IDX, op->kp.addr);
-	if (ret < 0) {
-		__arch_remove_optimized_kprobe(op, 0);
-		return ret;
-	}
-	op->optinsn.size = ret;
-
 	/* Copy arch-dep-instance from template */
-	memcpy(buf, &optprobe_template_entry, TMPL_END_IDX);
+	memcpy(buf, optprobe_template_entry, TMPL_END_IDX);
+
+	/* Copy instructions into the out-of-line buffer */
+	ret = copy_optimized_instructions(buf + TMPL_END_IDX, op->kp.addr,
+					  slot + TMPL_END_IDX);
+	if (ret < 0)
+		goto err;
+	op->optinsn.size = ret;
+	len = TMPL_END_IDX + op->optinsn.size;
 
 	/* Set probe information */
 	synthesize_set_arg1(buf + TMPL_MOVE_IDX, (unsigned long)op);
 
 	/* Set probe function call */
-	synthesize_relcall(buf + TMPL_CALL_IDX, optimized_callback);
+	synthesize_relcall(buf + TMPL_CALL_IDX,
+			   slot + TMPL_CALL_IDX, optimized_callback);
 
 	/* Set returning jmp instruction at the tail of out-of-line buffer */
-	synthesize_reljump(buf + TMPL_END_IDX + op->optinsn.size,
+	synthesize_reljump(buf + len, slot + len,
 			   (u8 *)op->kp.addr + op->optinsn.size);
+	len += RELATIVEJUMP_SIZE;
 
-	set_memory_ro((unsigned long)buf & PAGE_MASK, 1);
-
-	flush_icache_range((unsigned long) buf,
-			   (unsigned long) buf + TMPL_END_IDX +
-			   op->optinsn.size + RELATIVEJUMP_SIZE);
-	return 0;
+	/* We have to use text_poke for instruction buffer because it is RO */
+	text_poke(slot, buf, len);
+	ret = 0;
+out:
+	kfree(buf);
+	return ret;
+
+err:
+	__arch_remove_optimized_kprobe(op, 0);
+	goto out;
 }
 
 /*
+45 -109
@@ -56,122 +56,54 @@ static ssize_t direct_entry(struct file *f, const char __user *user_buf,
 			    size_t count, loff_t *off);
 
 #ifdef CONFIG_KPROBES
-static void lkdtm_handler(void);
+static int lkdtm_kprobe_handler(struct kprobe *kp, struct pt_regs *regs);
 static ssize_t lkdtm_debugfs_entry(struct file *f,
 				   const char __user *user_buf,
 				   size_t count, loff_t *off);
-
-
-/* jprobe entry point handlers. */
-static unsigned int jp_do_irq(unsigned int irq)
-{
-	lkdtm_handler();
-	jprobe_return();
-	return 0;
-}
-
-static irqreturn_t jp_handle_irq_event(unsigned int irq,
-				       struct irqaction *action)
-{
-	lkdtm_handler();
-	jprobe_return();
-	return 0;
-}
-
-static void jp_tasklet_action(struct softirq_action *a)
-{
-	lkdtm_handler();
-	jprobe_return();
-}
-
-static void jp_ll_rw_block(int rw, int nr, struct buffer_head *bhs[])
-{
-	lkdtm_handler();
-	jprobe_return();
-}
-
-struct scan_control;
-
-static unsigned long jp_shrink_inactive_list(unsigned long max_scan,
-					     struct zone *zone,
-					     struct scan_control *sc)
-{
-	lkdtm_handler();
-	jprobe_return();
-	return 0;
-}
-
-static int jp_hrtimer_start(struct hrtimer *timer, ktime_t tim,
-			    const enum hrtimer_mode mode)
-{
-	lkdtm_handler();
-	jprobe_return();
-	return 0;
-}
-
-static int jp_scsi_dispatch_cmd(struct scsi_cmnd *cmd)
-{
-	lkdtm_handler();
-	jprobe_return();
-	return 0;
-}
-
-# ifdef CONFIG_IDE
-static int jp_generic_ide_ioctl(ide_drive_t *drive, struct file *file,
-			struct block_device *bdev, unsigned int cmd,
-			unsigned long arg)
-{
-	lkdtm_handler();
-	jprobe_return();
-	return 0;
-}
-# endif
+# define CRASHPOINT_KPROBE(_symbol)				\
+	.kprobe = {						\
+		.symbol_name = (_symbol),			\
+		.pre_handler = lkdtm_kprobe_handler,		\
+	},
+# define CRASHPOINT_WRITE(_symbol)				\
+	(_symbol) ? lkdtm_debugfs_entry : direct_entry
+#else
+# define CRASHPOINT_KPROBE(_symbol)
+# define CRASHPOINT_WRITE(_symbol) direct_entry
+#endif
 
 /* Crash points */
 struct crashpoint {
 	const char *name;
 	const struct file_operations fops;
-	struct jprobe jprobe;
+	struct kprobe kprobe;
 };
 
-#define CRASHPOINT(_name, _write, _symbol, _entry)	\
+#define CRASHPOINT(_name, _symbol)			\
 	{						\
 		.name = _name,				\
 		.fops = {				\
 			.read	= lkdtm_debugfs_read,	\
 			.llseek	= generic_file_llseek,	\
 			.open	= lkdtm_debugfs_open,	\
-			.write	= _write,		\
-		},					\
-		.jprobe = {				\
-			.kp.symbol_name = _symbol,	\
-			.entry = (kprobe_opcode_t *)_entry, \
-		},					\
+			.write	= CRASHPOINT_WRITE(_symbol) \
+		},					\
+		CRASHPOINT_KPROBE(_symbol)		\
 	}
 
 /* Define the possible places where we can trigger a crash point. */
-struct crashpoint crashpoints[] = {
-	CRASHPOINT("DIRECT", direct_entry,
-		   NULL, NULL),
+static struct crashpoint crashpoints[] = {
+	CRASHPOINT("DIRECT",		 NULL),
 #ifdef CONFIG_KPROBES
-	CRASHPOINT("INT_HARDWARE_ENTRY", lkdtm_debugfs_entry,
-		   "do_IRQ", jp_do_irq),
-	CRASHPOINT("INT_HW_IRQ_EN", lkdtm_debugfs_entry,
-		   "handle_IRQ_event", jp_handle_irq_event),
-	CRASHPOINT("INT_TASKLET_ENTRY", lkdtm_debugfs_entry,
-		   "tasklet_action", jp_tasklet_action),
-	CRASHPOINT("FS_DEVRW", lkdtm_debugfs_entry,
-		   "ll_rw_block", jp_ll_rw_block),
-	CRASHPOINT("MEM_SWAPOUT", lkdtm_debugfs_entry,
-		   "shrink_inactive_list", jp_shrink_inactive_list),
-	CRASHPOINT("TIMERADD", lkdtm_debugfs_entry,
-		   "hrtimer_start", jp_hrtimer_start),
-	CRASHPOINT("SCSI_DISPATCH_CMD", lkdtm_debugfs_entry,
-		   "scsi_dispatch_cmd", jp_scsi_dispatch_cmd),
+	CRASHPOINT("INT_HARDWARE_ENTRY", "do_IRQ"),
+	CRASHPOINT("INT_HW_IRQ_EN",	 "handle_IRQ_event"),
+	CRASHPOINT("INT_TASKLET_ENTRY",	 "tasklet_action"),
+	CRASHPOINT("FS_DEVRW",		 "ll_rw_block"),
+	CRASHPOINT("MEM_SWAPOUT",	 "shrink_inactive_list"),
+	CRASHPOINT("TIMERADD",		 "hrtimer_start"),
+	CRASHPOINT("SCSI_DISPATCH_CMD",	 "scsi_dispatch_cmd"),
 # ifdef CONFIG_IDE
-	CRASHPOINT("IDE_CORE_CP", lkdtm_debugfs_entry,
-		   "generic_ide_ioctl", jp_generic_ide_ioctl),
+	CRASHPOINT("IDE_CORE_CP",	 "generic_ide_ioctl"),
 # endif
 #endif
 };
@@ -254,8 +186,8 @@ struct crashtype crashtypes[] = {
 };
 
 
-/* Global jprobe entry and crashtype. */
-static struct jprobe *lkdtm_jprobe;
+/* Global kprobe entry and crashtype. */
+static struct kprobe *lkdtm_kprobe;
 struct crashpoint *lkdtm_crashpoint;
 struct crashtype *lkdtm_crashtype;
@@ -298,7 +230,8 @@ static struct crashtype *find_crashtype(const char *name)
  */
 static noinline void lkdtm_do_action(struct crashtype *crashtype)
 {
-	BUG_ON(!crashtype || !crashtype->func);
+	if (WARN_ON(!crashtype || !crashtype->func))
+		return;
 	crashtype->func();
 }
@@ -308,22 +241,22 @@ static int lkdtm_register_cpoint(struct crashpoint *crashpoint,
 	int ret;
 
 	/* If this doesn't have a symbol, just call immediately. */
-	if (!crashpoint->jprobe.kp.symbol_name) {
+	if (!crashpoint->kprobe.symbol_name) {
 		lkdtm_do_action(crashtype);
 		return 0;
 	}
 
-	if (lkdtm_jprobe != NULL)
-		unregister_jprobe(lkdtm_jprobe);
+	if (lkdtm_kprobe != NULL)
+		unregister_kprobe(lkdtm_kprobe);
 
 	lkdtm_crashpoint = crashpoint;
 	lkdtm_crashtype = crashtype;
-	lkdtm_jprobe = &crashpoint->jprobe;
-	ret = register_jprobe(lkdtm_jprobe);
+	lkdtm_kprobe = &crashpoint->kprobe;
+	ret = register_kprobe(lkdtm_kprobe);
 	if (ret < 0) {
-		pr_info("Couldn't register jprobe %s\n",
-			crashpoint->jprobe.kp.symbol_name);
-		lkdtm_jprobe = NULL;
+		pr_info("Couldn't register kprobe %s\n",
+			crashpoint->kprobe.symbol_name);
+		lkdtm_kprobe = NULL;
 		lkdtm_crashpoint = NULL;
 		lkdtm_crashtype = NULL;
 	}
@@ -336,13 +269,14 @@ static int lkdtm_register_cpoint(struct crashpoint *crashpoint,
 static int crash_count = DEFAULT_COUNT;
 static DEFINE_SPINLOCK(crash_count_lock);
 
-/* Called by jprobe entry points. */
-static void lkdtm_handler(void)
+/* Called by kprobe entry points. */
+static int lkdtm_kprobe_handler(struct kprobe *kp, struct pt_regs *regs)
 {
 	unsigned long flags;
 	bool do_it = false;
 
-	BUG_ON(!lkdtm_crashpoint || !lkdtm_crashtype);
+	if (WARN_ON(!lkdtm_crashpoint || !lkdtm_crashtype))
+		return 0;
 
 	spin_lock_irqsave(&crash_count_lock, flags);
 	crash_count--;
@@ -357,6 +291,8 @@ static void lkdtm_handler(void)
 
 	if (do_it)
 		lkdtm_do_action(lkdtm_crashtype);
+
+	return 0;
 }
 
 static ssize_t lkdtm_debugfs_entry(struct file *f,
@@ -556,8 +492,8 @@ static void __exit lkdtm_module_exit(void)
 	/* Handle test-specific clean-up. */
 	lkdtm_usercopy_exit();
 
-	if (lkdtm_jprobe != NULL)
-		unregister_jprobe(lkdtm_jprobe);
+	if (lkdtm_kprobe != NULL)
+		unregister_kprobe(lkdtm_kprobe);
 
 	pr_info("Crash point unregistered\n");
 }
+16 -20
@@ -391,10 +391,6 @@ int register_kprobes(struct kprobe **kps, int num);
 void unregister_kprobes(struct kprobe **kps, int num);
 int setjmp_pre_handler(struct kprobe *, struct pt_regs *);
 int longjmp_break_handler(struct kprobe *, struct pt_regs *);
-int register_jprobe(struct jprobe *p);
-void unregister_jprobe(struct jprobe *p);
-int register_jprobes(struct jprobe **jps, int num);
-void unregister_jprobes(struct jprobe **jps, int num);
 void jprobe_return(void);
 unsigned long arch_deref_entry_point(void *);
@@ -443,20 +439,6 @@ static inline void unregister_kprobe(struct kprobe *p)
 static inline void unregister_kprobes(struct kprobe **kps, int num)
 {
 }
-static inline int register_jprobe(struct jprobe *p)
-{
-	return -ENOSYS;
-}
-static inline int register_jprobes(struct jprobe **jps, int num)
-{
-	return -ENOSYS;
-}
-static inline void unregister_jprobe(struct jprobe *p)
-{
-}
-static inline void unregister_jprobes(struct jprobe **jps, int num)
-{
-}
 static inline void jprobe_return(void)
 {
 }
@@ -486,6 +468,20 @@ static inline int enable_kprobe(struct kprobe *kp)
 	return -ENOSYS;
 }
 #endif /* CONFIG_KPROBES */
+static inline int register_jprobe(struct jprobe *p)
+{
+	return -ENOSYS;
+}
+static inline int register_jprobes(struct jprobe **jps, int num)
+{
+	return -ENOSYS;
+}
+static inline void unregister_jprobe(struct jprobe *p)
+{
+}
+static inline void unregister_jprobes(struct jprobe **jps, int num)
+{
+}
 static inline int disable_kretprobe(struct kretprobe *rp)
 {
 	return disable_kprobe(&rp->kp);
@@ -496,11 +492,11 @@ static inline int enable_kretprobe(struct kretprobe *rp)
 }
 static inline int disable_jprobe(struct jprobe *jp)
 {
-	return disable_kprobe(&jp->kp);
+	return -ENOSYS;
 }
 static inline int enable_jprobe(struct jprobe *jp)
 {
-	return enable_kprobe(&jp->kp);
+	return -ENOSYS;
 }
 
 #ifndef CONFIG_KPROBES
@@ -485,9 +485,9 @@ struct perf_addr_filters_head {
 };
 
 /**
- * enum perf_event_active_state - the states of a event
+ * enum perf_event_state - the states of an event
  */
-enum perf_event_active_state {
+enum perf_event_state {
 	PERF_EVENT_STATE_DEAD		= -4,
 	PERF_EVENT_STATE_EXIT		= -3,
 	PERF_EVENT_STATE_ERROR		= -2,
@@ -578,7 +578,7 @@ struct perf_event {
 	struct pmu			*pmu;
 	void				*pmu_private;
 
-	enum perf_event_active_state	state;
+	enum perf_event_state		state;
 	unsigned int			attach_state;
 	local64_t			count;
 	atomic64_t			child_count;
@@ -588,26 +588,10 @@ struct perf_event {
 	 * has been enabled (i.e. eligible to run, and the task has
 	 * been scheduled in, if this is a per-task event)
 	 * and running (scheduled onto the CPU), respectively.
-	 *
-	 * They are computed from tstamp_enabled, tstamp_running and
-	 * tstamp_stopped when the event is in INACTIVE or ACTIVE state.
 	 */
 	u64				total_time_enabled;
 	u64				total_time_running;
-
-	/*
-	 * These are timestamps used for computing total_time_enabled
-	 * and total_time_running when the event is in INACTIVE or
-	 * ACTIVE state, measured in nanoseconds from an arbitrary point
-	 * in time.
-	 * tstamp_enabled: the notional time when the event was enabled
-	 * tstamp_running: the notional time when the event was scheduled on
-	 * tstamp_stopped: in INACTIVE state, the notional time when the
-	 *	event was scheduled off.
-	 */
-	u64				tstamp_enabled;
-	u64				tstamp_running;
-	u64				tstamp_stopped;
+	u64				tstamp;
 
 	/*
 	 * timestamp shadows the actual context timing but it can
@@ -699,7 +683,6 @@ struct perf_event {
 
 #ifdef CONFIG_CGROUP_PERF
 	struct perf_cgroup		*cgrp; /* cgroup event is attach to */
-	int				cgrp_defer_enabled;
 #endif
 
 	struct list_head		sb_list;
@@ -806,6 +789,7 @@ struct perf_output_handle {
 struct bpf_perf_event_data_kern {
 	struct pt_regs *regs;
 	struct perf_sample_data *data;
+	struct perf_event *event;
 };
 
 #ifdef CONFIG_CGROUP_PERF
@@ -884,7 +868,8 @@ perf_event_create_kernel_counter(struct perf_event_attr *attr,
 				void *context);
 extern void perf_pmu_migrate_context(struct pmu *pmu,
 				int src_cpu, int dst_cpu);
-int perf_event_read_local(struct perf_event *event, u64 *value);
+int perf_event_read_local(struct perf_event *event, u64 *value,
+			  u64 *enabled, u64 *running);
 extern u64 perf_event_read_value(struct perf_event *event,
 				 u64 *enabled, u64 *running);
@@ -1286,7 +1271,8 @@ static inline const struct perf_event_attr *perf_event_attrs(struct perf_event *
 {
 	return ERR_PTR(-EINVAL);
 }
-static inline int perf_event_read_local(struct perf_event *event, u64 *value)
+static inline int perf_event_read_local(struct perf_event *event, u64 *value,
+					u64 *enabled, u64 *running)
 {
 	return -EINVAL;
 }
@@ -492,7 +492,7 @@ static void *perf_event_fd_array_get_ptr(struct bpf_map *map,
 
 	ee = ERR_PTR(-EOPNOTSUPP);
 	event = perf_file->private_data;
-	if (perf_event_read_local(event, &value) == -EOPNOTSUPP)
+	if (perf_event_read_local(event, &value, NULL, NULL) == -EOPNOTSUPP)
 		goto err_out;
 
 	ee = bpf_event_entry_gen(perf_file, map_file);
+199 -271
File diff suppressed because it is too large
+11 -7
@@ -117,7 +117,7 @@ enum kprobe_slot_state {
 	SLOT_USED = 2,
 };
 
-static void *alloc_insn_page(void)
+void __weak *alloc_insn_page(void)
 {
 	return module_alloc(PAGE_SIZE);
 }
@@ -573,13 +573,15 @@ static void kprobe_optimizer(struct work_struct *work)
 	do_unoptimize_kprobes();
 
 	/*
-	 * Step 2: Wait for quiescence period to ensure all running interrupts
-	 * are done. Because optprobe may modify multiple instructions
-	 * there is a chance that Nth instruction is interrupted. In that
-	 * case, running interrupt can return to 2nd-Nth byte of jump
-	 * instruction. This wait is for avoiding it.
+	 * Step 2: Wait for quiescence period to ensure all potentially
+	 * preempted tasks to have normally scheduled. Because optprobe
+	 * may modify multiple instructions, there is a chance that Nth
+	 * instruction is preempted. In that case, such tasks can return
+	 * to 2nd-Nth byte of jump instruction. This wait is for avoiding it.
+	 * Note that on non-preemptive kernel, this is transparently converted
+	 * to synchronize_sched() to wait for all interrupts to have completed.
	 */
-	synchronize_sched();
+	synchronize_rcu_tasks();
 
 	/* Step 3: Optimize kprobes after quiescence period */
 	do_optimize_kprobes();
@@ -1769,6 +1771,7 @@ unsigned long __weak arch_deref_entry_point(void *entry)
 	return (unsigned long)entry;
 }
 
+#if 0
 int register_jprobes(struct jprobe **jps, int num)
 {
 	int ret = 0, i;
@@ -1837,6 +1840,7 @@ void unregister_jprobes(struct jprobe **jps, int num)
 	}
 }
 EXPORT_SYMBOL_GPL(unregister_jprobes);
+#endif
 
 #ifdef CONFIG_KRETPROBES
 /*
+28 -1
@@ -22,7 +22,7 @@
 
 #define div_factor 3
 
-static u32 rand1, preh_val, posth_val, jph_val;
+static u32 rand1, preh_val, posth_val;
 static int errors, handler_errors, num_tests;
 static u32 (*target)(u32 value);
 static u32 (*target2)(u32 value);
@@ -34,6 +34,10 @@ static noinline u32 kprobe_target(u32 value)
 
 static int kp_pre_handler(struct kprobe *p, struct pt_regs *regs)
 {
+	if (preemptible()) {
+		handler_errors++;
+		pr_err("pre-handler is preemptible\n");
+	}
 	preh_val = (rand1 / div_factor);
 	return 0;
 }
@@ -41,6 +45,10 @@ static int kp_pre_handler(struct kprobe *p, struct pt_regs *regs)
 static void kp_post_handler(struct kprobe *p, struct pt_regs *regs,
 		unsigned long flags)
 {
+	if (preemptible()) {
+		handler_errors++;
+		pr_err("post-handler is preemptible\n");
+	}
 	if (preh_val != (rand1 / div_factor)) {
 		handler_errors++;
 		pr_err("incorrect value in post_handler\n");
@@ -154,8 +162,15 @@ static int test_kprobes(void)
 
 }
 
+#if 0
+static u32 jph_val;
+
 static u32 j_kprobe_target(u32 value)
 {
+	if (preemptible()) {
+		handler_errors++;
+		pr_err("jprobe-handler is preemptible\n");
+	}
 	if (value != rand1) {
 		handler_errors++;
 		pr_err("incorrect value in jprobe handler\n");
@@ -227,11 +242,19 @@ static int test_jprobes(void)
 
 	return 0;
 }
+#else
+#define test_jprobe() (0)
+#define test_jprobes() (0)
+#endif
 #ifdef CONFIG_KRETPROBES
 static u32 krph_val;
 
 static int entry_handler(struct kretprobe_instance *ri, struct pt_regs *regs)
 {
+	if (preemptible()) {
+		handler_errors++;
+		pr_err("kretprobe entry handler is preemptible\n");
+	}
 	krph_val = (rand1 / div_factor);
 	return 0;
 }
@@ -240,6 +263,10 @@ static int return_handler(struct kretprobe_instance *ri, struct pt_regs *regs)
 {
 	unsigned long ret = regs_return_value(regs);
 
+	if (preemptible()) {
+		handler_errors++;
+		pr_err("kretprobe return handler is preemptible\n");
+	}
 	if (ret != (rand1 / div_factor)) {
 		handler_errors++;
 		pr_err("incorrect value in kretprobe handler\n");
@@ -275,7 +275,7 @@ BPF_CALL_2(bpf_perf_event_read, struct bpf_map *, map, u64, flags)
 	if (!ee)
 		return -ENOENT;
 
-	err = perf_event_read_local(ee->event, &value);
+	err = perf_event_read_local(ee->event, &value, NULL, NULL);
 	/*
 	 * this api is ugly since we miss [-22..-2] range of valid
 	 * counter values, but that's uapi
@@ -1,5 +1,5 @@
 # builds the kprobes example kernel modules;
 # then to use one (as root): insmod <module_name.ko>
 
-obj-$(CONFIG_SAMPLE_KPROBES) += kprobe_example.o jprobe_example.o
+obj-$(CONFIG_SAMPLE_KPROBES) += kprobe_example.o
 obj-$(CONFIG_SAMPLE_KRETPROBES) += kretprobe_example.o
@@ -1,67 +0,0 @@
-/*
- * Here's a sample kernel module showing the use of jprobes to dump
- * the arguments of _do_fork().
- *
- * For more information on theory of operation of jprobes, see
- * Documentation/kprobes.txt
- *
- * Build and insert the kernel module as done in the kprobe example.
- * You will see the trace data in /var/log/messages and on the
- * console whenever _do_fork() is invoked to create a new process.
- * (Some messages may be suppressed if syslogd is configured to
- * eliminate duplicate messages.)
- */
-
-#include <linux/kernel.h>
-#include <linux/module.h>
-#include <linux/kprobes.h>
-
-/*
- * Jumper probe for _do_fork.
- * Mirror principle enables access to arguments of the probed routine
- * from the probe handler.
- */
-
-/* Proxy routine having the same arguments as actual _do_fork() routine */
-static long j_do_fork(unsigned long clone_flags, unsigned long stack_start,
-	      unsigned long stack_size, int __user *parent_tidptr,
-	      int __user *child_tidptr, unsigned long tls)
-{
-	pr_info("jprobe: clone_flags = 0x%lx, stack_start = 0x%lx "
-		"stack_size = 0x%lx\n", clone_flags, stack_start, stack_size);
-
-	/* Always end with a call to jprobe_return(). */
-	jprobe_return();
-	return 0;
-}
-
-static struct jprobe my_jprobe = {
-	.entry			= j_do_fork,
-	.kp = {
-		.symbol_name	= "_do_fork",
-	},
-};
-
-static int __init jprobe_init(void)
-{
-	int ret;
-
-	ret = register_jprobe(&my_jprobe);
-	if (ret < 0) {
-		pr_err("register_jprobe failed, returned %d\n", ret);
-		return -1;
-	}
-	pr_info("Planted jprobe at %p, handler addr %p\n",
-		my_jprobe.kp.addr, my_jprobe.entry);
-	return 0;
-}
-
-static void __exit jprobe_exit(void)
-{
-	unregister_jprobe(&my_jprobe);
-	pr_info("jprobe at %p unregistered\n", my_jprobe.kp.addr);
-}
-
-module_init(jprobe_init)
-module_exit(jprobe_exit)
-MODULE_LICENSE("GPL");
Some files were not shown because too many files have changed in this diff.