mirror of
https://github.com/armbian/linux-cix.git
synced 2026-01-06 12:30:45 -08:00
Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:
"147 patches, based on 7d2a07b769.
Subsystems affected by this patch series: mm (memory-hotplug, rmap,
ioremap, highmem, cleanups, secretmem, kfence, damon, and vmscan),
alpha, percpu, procfs, misc, core-kernel, MAINTAINERS, lib,
checkpatch, epoll, init, nilfs2, coredump, fork, pids, criu, kconfig,
selftests, ipc, and scripts"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (94 commits)
scripts: check_extable: fix typo in user error message
mm/workingset: correct kernel-doc notations
ipc: replace costly bailout check in sysvipc_find_ipc()
selftests/memfd: remove unused variable
Kconfig.debug: drop selecting non-existing HARDLOCKUP_DETECTOR_ARCH
configs: remove the obsolete CONFIG_INPUT_POLLDEV
prctl: allow to setup brk for et_dyn executables
pid: cleanup the stale comment mentioning pidmap_init().
kernel/fork.c: unexport get_{mm,task}_exe_file
coredump: fix memleak in dump_vma_snapshot()
fs/coredump.c: log if a core dump is aborted due to changed file permissions
nilfs2: use refcount_dec_and_lock() to fix potential UAF
nilfs2: fix memory leak in nilfs_sysfs_delete_snapshot_group
nilfs2: fix memory leak in nilfs_sysfs_create_snapshot_group
nilfs2: fix memory leak in nilfs_sysfs_delete_##name##_group
nilfs2: fix memory leak in nilfs_sysfs_create_##name##_group
nilfs2: fix NULL pointer in nilfs_##name##_attr_release
nilfs2: fix memory leak in nilfs_sysfs_create_device_group
trap: cleanup trap_init()
init: move usermodehelper_enable() to populate_rootfs()
...
This commit is contained in:
15
Documentation/admin-guide/mm/damon/index.rst
Normal file
15
Documentation/admin-guide/mm/damon/index.rst
Normal file
@@ -0,0 +1,15 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
========================
|
||||
Monitoring Data Accesses
|
||||
========================
|
||||
|
||||
:doc:`DAMON </vm/damon/index>` allows light-weight data access monitoring.
|
||||
Using DAMON, users can analyze the memory access patterns of their systems and
|
||||
optimize those.
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
start
|
||||
usage
|
||||
114
Documentation/admin-guide/mm/damon/start.rst
Normal file
114
Documentation/admin-guide/mm/damon/start.rst
Normal file
@@ -0,0 +1,114 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
===============
|
||||
Getting Started
|
||||
===============
|
||||
|
||||
This document briefly describes how you can use DAMON by demonstrating its
|
||||
default user space tool. Please note that this document describes only a part
|
||||
of its features for brevity. Please refer to :doc:`usage` for more details.
|
||||
|
||||
|
||||
TL; DR
|
||||
======
|
||||
|
||||
Follow the commands below to monitor and visualize the memory access pattern of
|
||||
your workload. ::
|
||||
|
||||
# # build the kernel with CONFIG_DAMON_*=y, install it, and reboot
|
||||
# mount -t debugfs none /sys/kernel/debug/
|
||||
# git clone https://github.com/awslabs/damo
|
||||
# ./damo/damo record $(pidof <your workload>)
|
||||
# ./damo/damo report heat --plot_ascii
|
||||
|
||||
The final command draws the access heatmap of ``<your workload>``. The heatmap
|
||||
shows which memory region (x-axis) is accessed when (y-axis) and how frequently
|
||||
(number; the higher the more accesses have been observed). ::
|
||||
|
||||
111111111111111111111111111111111111111111111111111111110000
|
||||
111121111111111111111111111111211111111111111111111111110000
|
||||
000000000000000000000000000000000000000000000000001555552000
|
||||
000000000000000000000000000000000000000000000222223555552000
|
||||
000000000000000000000000000000000000000011111677775000000000
|
||||
000000000000000000000000000000000000000488888000000000000000
|
||||
000000000000000000000000000000000177888400000000000000000000
|
||||
000000000000000000000000000046666522222100000000000000000000
|
||||
000000000000000000000014444344444300000000000000000000000000
|
||||
000000000000000002222245555510000000000000000000000000000000
|
||||
# access_frequency: 0 1 2 3 4 5 6 7 8 9
|
||||
# x-axis: space (140286319947776-140286426374096: 101.496 MiB)
|
||||
# y-axis: time (605442256436361-605479951866441: 37.695430s)
|
||||
# resolution: 60x10 (1.692 MiB and 3.770s for each character)
|
||||
|
||||
|
||||
Prerequisites
|
||||
=============
|
||||
|
||||
Kernel
|
||||
------
|
||||
|
||||
You should first ensure your system is running on a kernel built with
|
||||
``CONFIG_DAMON_*=y``.
|
||||
|
||||
|
||||
User Space Tool
|
||||
---------------
|
||||
|
||||
For the demonstration, we will use the default user space tool for DAMON,
|
||||
called DAMON Operator (DAMO). It is available at
|
||||
https://github.com/awslabs/damo. The examples below assume that ``damo`` is on
|
||||
your ``$PATH``. It's not mandatory, though.
|
||||
|
||||
Because DAMO is using the debugfs interface (refer to :doc:`usage` for the
|
||||
detail) of DAMON, you should ensure debugfs is mounted. Mount it manually as
|
||||
below::
|
||||
|
||||
# mount -t debugfs none /sys/kernel/debug/
|
||||
|
||||
or append the following line to your ``/etc/fstab`` file so that your system
|
||||
can automatically mount debugfs upon booting::
|
||||
|
||||
debugfs /sys/kernel/debug debugfs defaults 0 0
|
||||
|
||||
|
||||
Recording Data Access Patterns
|
||||
==============================
|
||||
|
||||
The commands below record the memory access patterns of a program and save the
|
||||
monitoring results to a file. ::
|
||||
|
||||
$ git clone https://github.com/sjp38/masim
|
||||
$ cd masim; make; ./masim ./configs/zigzag.cfg &
|
||||
$ sudo damo record -o damon.data $(pidof masim)
|
||||
|
||||
The first two lines of the commands download an artificial memory access
|
||||
generator program and run it in the background. The generator will repeatedly
|
||||
access two 100 MiB sized memory regions one by one. You can substitute this
|
||||
with your real workload. The last line asks ``damo`` to record the access
|
||||
pattern in the ``damon.data`` file.
|
||||
|
||||
|
||||
Visualizing Recorded Patterns
|
||||
=============================
|
||||
|
||||
The following three commands visualize the recorded access patterns and save
|
||||
the results as separate image files. ::
|
||||
|
||||
$ damo report heats --heatmap access_pattern_heatmap.png
|
||||
$ damo report wss --range 0 101 1 --plot wss_dist.png
|
||||
$ damo report wss --range 0 101 1 --sortby time --plot wss_chron_change.png
|
||||
|
||||
- ``access_pattern_heatmap.png`` will visualize the data access pattern in a
|
||||
heatmap, showing which memory region (y-axis) got accessed when (x-axis)
|
||||
and how frequently (color).
|
||||
- ``wss_dist.png`` will show the distribution of the working set size.
|
||||
- ``wss_chron_change.png`` will show how the working set size has
|
||||
chronologically changed.
|
||||
|
||||
You can view the visualizations of this example workload at [1]_.
|
||||
Visualizations of other realistic workloads are available at [2]_ [3]_ [4]_.
|
||||
|
||||
.. [1] https://damonitor.github.io/doc/html/v17/admin-guide/mm/damon/start.html#visualizing-recorded-patterns
|
||||
.. [2] https://damonitor.github.io/test/result/visual/latest/rec.heatmap.1.png.html
|
||||
.. [3] https://damonitor.github.io/test/result/visual/latest/rec.wss_sz.png.html
|
||||
.. [4] https://damonitor.github.io/test/result/visual/latest/rec.wss_time.png.html
|
||||
112
Documentation/admin-guide/mm/damon/usage.rst
Normal file
112
Documentation/admin-guide/mm/damon/usage.rst
Normal file
@@ -0,0 +1,112 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
===============
|
||||
Detailed Usages
|
||||
===============
|
||||
|
||||
DAMON provides below three interfaces for different users.
|
||||
|
||||
- *DAMON user space tool.*
|
||||
This is for privileged people such as system administrators who want a
|
||||
just-working human-friendly interface. Using this, users can use the DAMON’s
|
||||
major features in a human-friendly way. It may not be highly tuned for
|
||||
special cases, though. It supports only virtual address spaces monitoring.
|
||||
- *debugfs interface.*
|
||||
This is for privileged user space programmers who want more optimized use of
|
||||
DAMON. Using this, users can use DAMON’s major features by reading
|
||||
from and writing to special debugfs files. Therefore, you can write and use
|
||||
your personalized DAMON debugfs wrapper programs that reads/writes the
|
||||
debugfs files instead of you. The DAMON user space tool is also a reference
|
||||
implementation of such programs. It supports only virtual address spaces
|
||||
monitoring.
|
||||
- *Kernel Space Programming Interface.*
|
||||
This is for kernel space programmers. Using this, users can utilize every
|
||||
feature of DAMON most flexibly and efficiently by writing kernel space
|
||||
DAMON application programs for you. You can even extend DAMON for various
|
||||
address spaces.
|
||||
|
||||
Nevertheless, you could write your own user space tool using the debugfs
|
||||
interface. A reference implementation is available at
|
||||
https://github.com/awslabs/damo. If you are a kernel programmer, you could
|
||||
refer to :doc:`/vm/damon/api` for the kernel space programming interface. For
|
||||
the reason, this document describes only the debugfs interface
|
||||
|
||||
debugfs Interface
|
||||
=================
|
||||
|
||||
DAMON exports three files, ``attrs``, ``target_ids``, and ``monitor_on`` under
|
||||
its debugfs directory, ``<debugfs>/damon/``.
|
||||
|
||||
|
||||
Attributes
|
||||
----------
|
||||
|
||||
Users can get and set the ``sampling interval``, ``aggregation interval``,
|
||||
``regions update interval``, and min/max number of monitoring target regions by
|
||||
reading from and writing to the ``attrs`` file. To know about the monitoring
|
||||
attributes in detail, please refer to the :doc:`/vm/damon/design`. For
|
||||
example, below commands set those values to 5 ms, 100 ms, 1,000 ms, 10 and
|
||||
1000, and then check it again::
|
||||
|
||||
# cd <debugfs>/damon
|
||||
# echo 5000 100000 1000000 10 1000 > attrs
|
||||
# cat attrs
|
||||
5000 100000 1000000 10 1000
|
||||
|
||||
|
||||
Target IDs
|
||||
----------
|
||||
|
||||
Some types of address spaces supports multiple monitoring target. For example,
|
||||
the virtual memory address spaces monitoring can have multiple processes as the
|
||||
monitoring targets. Users can set the targets by writing relevant id values of
|
||||
the targets to, and get the ids of the current targets by reading from the
|
||||
``target_ids`` file. In case of the virtual address spaces monitoring, the
|
||||
values should be pids of the monitoring target processes. For example, below
|
||||
commands set processes having pids 42 and 4242 as the monitoring targets and
|
||||
check it again::
|
||||
|
||||
# cd <debugfs>/damon
|
||||
# echo 42 4242 > target_ids
|
||||
# cat target_ids
|
||||
42 4242
|
||||
|
||||
Note that setting the target ids doesn't start the monitoring.
|
||||
|
||||
|
||||
Turning On/Off
|
||||
--------------
|
||||
|
||||
Setting the files as described above doesn't incur effect unless you explicitly
|
||||
start the monitoring. You can start, stop, and check the current status of the
|
||||
monitoring by writing to and reading from the ``monitor_on`` file. Writing
|
||||
``on`` to the file starts the monitoring of the targets with the attributes.
|
||||
Writing ``off`` to the file stops those. DAMON also stops if every target
|
||||
process is terminated. Below example commands turn on, off, and check the
|
||||
status of DAMON::
|
||||
|
||||
# cd <debugfs>/damon
|
||||
# echo on > monitor_on
|
||||
# echo off > monitor_on
|
||||
# cat monitor_on
|
||||
off
|
||||
|
||||
Please note that you cannot write to the above-mentioned debugfs files while
|
||||
the monitoring is turned on. If you write to the files while DAMON is running,
|
||||
an error code such as ``-EBUSY`` will be returned.
|
||||
|
||||
|
||||
Tracepoint for Monitoring Results
|
||||
=================================
|
||||
|
||||
DAMON provides the monitoring results via a tracepoint,
|
||||
``damon:damon_aggregated``. While the monitoring is turned on, you could
|
||||
record the tracepoint events and show results using tracepoint supporting tools
|
||||
like ``perf``. For example::
|
||||
|
||||
# echo on > monitor_on
|
||||
# perf record -e damon:damon_aggregated &
|
||||
# sleep 5
|
||||
# kill 9 $(pidof perf)
|
||||
# echo off > monitor_on
|
||||
# perf script
|
||||
@@ -27,6 +27,7 @@ the Linux memory management.
|
||||
|
||||
concepts
|
||||
cma_debugfs
|
||||
damon/index
|
||||
hugetlbpage
|
||||
idle_page_tracking
|
||||
ksm
|
||||
|
||||
File diff suppressed because it is too large
Load Diff
@@ -65,25 +65,27 @@ Error reports
|
||||
A typical out-of-bounds access looks like this::
|
||||
|
||||
==================================================================
|
||||
BUG: KFENCE: out-of-bounds read in test_out_of_bounds_read+0xa3/0x22b
|
||||
BUG: KFENCE: out-of-bounds read in test_out_of_bounds_read+0xa6/0x234
|
||||
|
||||
Out-of-bounds read at 0xffffffffb672efff (1B left of kfence-#17):
|
||||
test_out_of_bounds_read+0xa3/0x22b
|
||||
kunit_try_run_case+0x51/0x85
|
||||
Out-of-bounds read at 0xffff8c3f2e291fff (1B left of kfence-#72):
|
||||
test_out_of_bounds_read+0xa6/0x234
|
||||
kunit_try_run_case+0x61/0xa0
|
||||
kunit_generic_run_threadfn_adapter+0x16/0x30
|
||||
kthread+0x137/0x160
|
||||
kthread+0x176/0x1b0
|
||||
ret_from_fork+0x22/0x30
|
||||
|
||||
kfence-#17 [0xffffffffb672f000-0xffffffffb672f01f, size=32, cache=kmalloc-32] allocated by task 507:
|
||||
test_alloc+0xf3/0x25b
|
||||
test_out_of_bounds_read+0x98/0x22b
|
||||
kunit_try_run_case+0x51/0x85
|
||||
kfence-#72: 0xffff8c3f2e292000-0xffff8c3f2e29201f, size=32, cache=kmalloc-32
|
||||
|
||||
allocated by task 484 on cpu 0 at 32.919330s:
|
||||
test_alloc+0xfe/0x738
|
||||
test_out_of_bounds_read+0x9b/0x234
|
||||
kunit_try_run_case+0x61/0xa0
|
||||
kunit_generic_run_threadfn_adapter+0x16/0x30
|
||||
kthread+0x137/0x160
|
||||
kthread+0x176/0x1b0
|
||||
ret_from_fork+0x22/0x30
|
||||
|
||||
CPU: 4 PID: 107 Comm: kunit_try_catch Not tainted 5.8.0-rc6+ #7
|
||||
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
|
||||
CPU: 0 PID: 484 Comm: kunit_try_catch Not tainted 5.13.0-rc3+ #7
|
||||
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
|
||||
==================================================================
|
||||
|
||||
The header of the report provides a short summary of the function involved in
|
||||
@@ -96,30 +98,32 @@ Use-after-free accesses are reported as::
|
||||
==================================================================
|
||||
BUG: KFENCE: use-after-free read in test_use_after_free_read+0xb3/0x143
|
||||
|
||||
Use-after-free read at 0xffffffffb673dfe0 (in kfence-#24):
|
||||
Use-after-free read at 0xffff8c3f2e2a0000 (in kfence-#79):
|
||||
test_use_after_free_read+0xb3/0x143
|
||||
kunit_try_run_case+0x51/0x85
|
||||
kunit_try_run_case+0x61/0xa0
|
||||
kunit_generic_run_threadfn_adapter+0x16/0x30
|
||||
kthread+0x137/0x160
|
||||
kthread+0x176/0x1b0
|
||||
ret_from_fork+0x22/0x30
|
||||
|
||||
kfence-#24 [0xffffffffb673dfe0-0xffffffffb673dfff, size=32, cache=kmalloc-32] allocated by task 507:
|
||||
test_alloc+0xf3/0x25b
|
||||
kfence-#79: 0xffff8c3f2e2a0000-0xffff8c3f2e2a001f, size=32, cache=kmalloc-32
|
||||
|
||||
allocated by task 488 on cpu 2 at 33.871326s:
|
||||
test_alloc+0xfe/0x738
|
||||
test_use_after_free_read+0x76/0x143
|
||||
kunit_try_run_case+0x51/0x85
|
||||
kunit_try_run_case+0x61/0xa0
|
||||
kunit_generic_run_threadfn_adapter+0x16/0x30
|
||||
kthread+0x137/0x160
|
||||
kthread+0x176/0x1b0
|
||||
ret_from_fork+0x22/0x30
|
||||
|
||||
freed by task 507:
|
||||
freed by task 488 on cpu 2 at 33.871358s:
|
||||
test_use_after_free_read+0xa8/0x143
|
||||
kunit_try_run_case+0x51/0x85
|
||||
kunit_try_run_case+0x61/0xa0
|
||||
kunit_generic_run_threadfn_adapter+0x16/0x30
|
||||
kthread+0x137/0x160
|
||||
kthread+0x176/0x1b0
|
||||
ret_from_fork+0x22/0x30
|
||||
|
||||
CPU: 4 PID: 109 Comm: kunit_try_catch Tainted: G W 5.8.0-rc6+ #7
|
||||
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
|
||||
CPU: 2 PID: 488 Comm: kunit_try_catch Tainted: G B 5.13.0-rc3+ #7
|
||||
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
|
||||
==================================================================
|
||||
|
||||
KFENCE also reports on invalid frees, such as double-frees::
|
||||
@@ -127,30 +131,32 @@ KFENCE also reports on invalid frees, such as double-frees::
|
||||
==================================================================
|
||||
BUG: KFENCE: invalid free in test_double_free+0xdc/0x171
|
||||
|
||||
Invalid free of 0xffffffffb6741000:
|
||||
Invalid free of 0xffff8c3f2e2a4000 (in kfence-#81):
|
||||
test_double_free+0xdc/0x171
|
||||
kunit_try_run_case+0x51/0x85
|
||||
kunit_try_run_case+0x61/0xa0
|
||||
kunit_generic_run_threadfn_adapter+0x16/0x30
|
||||
kthread+0x137/0x160
|
||||
kthread+0x176/0x1b0
|
||||
ret_from_fork+0x22/0x30
|
||||
|
||||
kfence-#26 [0xffffffffb6741000-0xffffffffb674101f, size=32, cache=kmalloc-32] allocated by task 507:
|
||||
test_alloc+0xf3/0x25b
|
||||
kfence-#81: 0xffff8c3f2e2a4000-0xffff8c3f2e2a401f, size=32, cache=kmalloc-32
|
||||
|
||||
allocated by task 490 on cpu 1 at 34.175321s:
|
||||
test_alloc+0xfe/0x738
|
||||
test_double_free+0x76/0x171
|
||||
kunit_try_run_case+0x51/0x85
|
||||
kunit_try_run_case+0x61/0xa0
|
||||
kunit_generic_run_threadfn_adapter+0x16/0x30
|
||||
kthread+0x137/0x160
|
||||
kthread+0x176/0x1b0
|
||||
ret_from_fork+0x22/0x30
|
||||
|
||||
freed by task 507:
|
||||
freed by task 490 on cpu 1 at 34.175348s:
|
||||
test_double_free+0xa8/0x171
|
||||
kunit_try_run_case+0x51/0x85
|
||||
kunit_try_run_case+0x61/0xa0
|
||||
kunit_generic_run_threadfn_adapter+0x16/0x30
|
||||
kthread+0x137/0x160
|
||||
kthread+0x176/0x1b0
|
||||
ret_from_fork+0x22/0x30
|
||||
|
||||
CPU: 4 PID: 111 Comm: kunit_try_catch Tainted: G W 5.8.0-rc6+ #7
|
||||
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
|
||||
CPU: 1 PID: 490 Comm: kunit_try_catch Tainted: G B 5.13.0-rc3+ #7
|
||||
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
|
||||
==================================================================
|
||||
|
||||
KFENCE also uses pattern-based redzones on the other side of an object's guard
|
||||
@@ -160,23 +166,25 @@ These are reported on frees::
|
||||
==================================================================
|
||||
BUG: KFENCE: memory corruption in test_kmalloc_aligned_oob_write+0xef/0x184
|
||||
|
||||
Corrupted memory at 0xffffffffb6797ff9 [ 0xac . . . . . . ] (in kfence-#69):
|
||||
Corrupted memory at 0xffff8c3f2e33aff9 [ 0xac . . . . . . ] (in kfence-#156):
|
||||
test_kmalloc_aligned_oob_write+0xef/0x184
|
||||
kunit_try_run_case+0x51/0x85
|
||||
kunit_try_run_case+0x61/0xa0
|
||||
kunit_generic_run_threadfn_adapter+0x16/0x30
|
||||
kthread+0x137/0x160
|
||||
kthread+0x176/0x1b0
|
||||
ret_from_fork+0x22/0x30
|
||||
|
||||
kfence-#69 [0xffffffffb6797fb0-0xffffffffb6797ff8, size=73, cache=kmalloc-96] allocated by task 507:
|
||||
test_alloc+0xf3/0x25b
|
||||
kfence-#156: 0xffff8c3f2e33afb0-0xffff8c3f2e33aff8, size=73, cache=kmalloc-96
|
||||
|
||||
allocated by task 502 on cpu 7 at 42.159302s:
|
||||
test_alloc+0xfe/0x738
|
||||
test_kmalloc_aligned_oob_write+0x57/0x184
|
||||
kunit_try_run_case+0x51/0x85
|
||||
kunit_try_run_case+0x61/0xa0
|
||||
kunit_generic_run_threadfn_adapter+0x16/0x30
|
||||
kthread+0x137/0x160
|
||||
kthread+0x176/0x1b0
|
||||
ret_from_fork+0x22/0x30
|
||||
|
||||
CPU: 4 PID: 120 Comm: kunit_try_catch Tainted: G W 5.8.0-rc6+ #7
|
||||
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
|
||||
CPU: 7 PID: 502 Comm: kunit_try_catch Tainted: G B 5.13.0-rc3+ #7
|
||||
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
|
||||
==================================================================
|
||||
|
||||
For such errors, the address where the corruption occurred as well as the
|
||||
|
||||
@@ -130,9 +130,10 @@ Getting Help
|
||||
------------
|
||||
|
||||
- `Website <https://clangbuiltlinux.github.io/>`_
|
||||
- `Mailing List <https://groups.google.com/forum/#!forum/clang-built-linux>`_: <clang-built-linux@googlegroups.com>
|
||||
- `Mailing List <https://lore.kernel.org/llvm/>`_: <llvm@lists.linux.dev>
|
||||
- `Old Mailing List Archives <https://groups.google.com/g/clang-built-linux>`_
|
||||
- `Issue Tracker <https://github.com/ClangBuiltLinux/linux/issues>`_
|
||||
- IRC: #clangbuiltlinux on chat.freenode.net
|
||||
- IRC: #clangbuiltlinux on irc.libera.chat
|
||||
- `Telegram <https://t.me/ClangBuiltLinux>`_: @ClangBuiltLinux
|
||||
- `Wiki <https://github.com/ClangBuiltLinux/linux/wiki>`_
|
||||
- `Beginner Bugs <https://github.com/ClangBuiltLinux/linux/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22>`_
|
||||
|
||||
20
Documentation/vm/damon/api.rst
Normal file
20
Documentation/vm/damon/api.rst
Normal file
@@ -0,0 +1,20 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
=============
|
||||
API Reference
|
||||
=============
|
||||
|
||||
Kernel space programs can use every feature of DAMON using below APIs. All you
|
||||
need to do is including ``damon.h``, which is located in ``include/linux/`` of
|
||||
the source tree.
|
||||
|
||||
Structures
|
||||
==========
|
||||
|
||||
.. kernel-doc:: include/linux/damon.h
|
||||
|
||||
|
||||
Functions
|
||||
=========
|
||||
|
||||
.. kernel-doc:: mm/damon/core.c
|
||||
166
Documentation/vm/damon/design.rst
Normal file
166
Documentation/vm/damon/design.rst
Normal file
@@ -0,0 +1,166 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
======
|
||||
Design
|
||||
======
|
||||
|
||||
Configurable Layers
|
||||
===================
|
||||
|
||||
DAMON provides data access monitoring functionality while making the accuracy
|
||||
and the overhead controllable. The fundamental access monitorings require
|
||||
primitives that dependent on and optimized for the target address space. On
|
||||
the other hand, the accuracy and overhead tradeoff mechanism, which is the core
|
||||
of DAMON, is in the pure logic space. DAMON separates the two parts in
|
||||
different layers and defines its interface to allow various low level
|
||||
primitives implementations configurable with the core logic.
|
||||
|
||||
Due to this separated design and the configurable interface, users can extend
|
||||
DAMON for any address space by configuring the core logics with appropriate low
|
||||
level primitive implementations. If appropriate one is not provided, users can
|
||||
implement the primitives on their own.
|
||||
|
||||
For example, physical memory, virtual memory, swap space, those for specific
|
||||
processes, NUMA nodes, files, and backing memory devices would be supportable.
|
||||
Also, if some architectures or devices support special optimized access check
|
||||
primitives, those will be easily configurable.
|
||||
|
||||
|
||||
Reference Implementations of Address Space Specific Primitives
|
||||
==============================================================
|
||||
|
||||
The low level primitives for the fundamental access monitoring are defined in
|
||||
two parts:
|
||||
|
||||
1. Identification of the monitoring target address range for the address space.
|
||||
2. Access check of specific address range in the target space.
|
||||
|
||||
DAMON currently provides the implementation of the primitives for only the
|
||||
virtual address spaces. Below two subsections describe how it works.
|
||||
|
||||
|
||||
VMA-based Target Address Range Construction
|
||||
-------------------------------------------
|
||||
|
||||
Only small parts in the super-huge virtual address space of the processes are
|
||||
mapped to the physical memory and accessed. Thus, tracking the unmapped
|
||||
address regions is just wasteful. However, because DAMON can deal with some
|
||||
level of noise using the adaptive regions adjustment mechanism, tracking every
|
||||
mapping is not strictly required but could even incur a high overhead in some
|
||||
cases. That said, too huge unmapped areas inside the monitoring target should
|
||||
be removed to not take the time for the adaptive mechanism.
|
||||
|
||||
For the reason, this implementation converts the complex mappings to three
|
||||
distinct regions that cover every mapped area of the address space. The two
|
||||
gaps between the three regions are the two biggest unmapped areas in the given
|
||||
address space. The two biggest unmapped areas would be the gap between the
|
||||
heap and the uppermost mmap()-ed region, and the gap between the lowermost
|
||||
mmap()-ed region and the stack in most of the cases. Because these gaps are
|
||||
exceptionally huge in usual address spaces, excluding these will be sufficient
|
||||
to make a reasonable trade-off. Below shows this in detail::
|
||||
|
||||
<heap>
|
||||
<BIG UNMAPPED REGION 1>
|
||||
<uppermost mmap()-ed region>
|
||||
(small mmap()-ed regions and munmap()-ed regions)
|
||||
<lowermost mmap()-ed region>
|
||||
<BIG UNMAPPED REGION 2>
|
||||
<stack>
|
||||
|
||||
|
||||
PTE Accessed-bit Based Access Check
|
||||
-----------------------------------
|
||||
|
||||
The implementation for the virtual address space uses PTE Accessed-bit for
|
||||
basic access checks. It finds the relevant PTE Accessed bit from the address
|
||||
by walking the page table for the target task of the address. In this way, the
|
||||
implementation finds and clears the bit for next sampling target address and
|
||||
checks whether the bit set again after one sampling period. This could disturb
|
||||
other kernel subsystems using the Accessed bits, namely Idle page tracking and
|
||||
the reclaim logic. To avoid such disturbances, DAMON makes it mutually
|
||||
exclusive with Idle page tracking and uses ``PG_idle`` and ``PG_young`` page
|
||||
flags to solve the conflict with the reclaim logic, as Idle page tracking does.
|
||||
|
||||
|
||||
Address Space Independent Core Mechanisms
|
||||
=========================================
|
||||
|
||||
Below four sections describe each of the DAMON core mechanisms and the five
|
||||
monitoring attributes, ``sampling interval``, ``aggregation interval``,
|
||||
``regions update interval``, ``minimum number of regions``, and ``maximum
|
||||
number of regions``.
|
||||
|
||||
|
||||
Access Frequency Monitoring
|
||||
---------------------------
|
||||
|
||||
The output of DAMON says what pages are how frequently accessed for a given
|
||||
duration. The resolution of the access frequency is controlled by setting
|
||||
``sampling interval`` and ``aggregation interval``. In detail, DAMON checks
|
||||
access to each page per ``sampling interval`` and aggregates the results. In
|
||||
other words, counts the number of the accesses to each page. After each
|
||||
``aggregation interval`` passes, DAMON calls callback functions that previously
|
||||
registered by users so that users can read the aggregated results and then
|
||||
clears the results. This can be described in below simple pseudo-code::
|
||||
|
||||
while monitoring_on:
|
||||
for page in monitoring_target:
|
||||
if accessed(page):
|
||||
nr_accesses[page] += 1
|
||||
if time() % aggregation_interval == 0:
|
||||
for callback in user_registered_callbacks:
|
||||
callback(monitoring_target, nr_accesses)
|
||||
for page in monitoring_target:
|
||||
nr_accesses[page] = 0
|
||||
sleep(sampling interval)
|
||||
|
||||
The monitoring overhead of this mechanism will arbitrarily increase as the
|
||||
size of the target workload grows.
|
||||
|
||||
|
||||
Region Based Sampling
|
||||
---------------------
|
||||
|
||||
To avoid the unbounded increase of the overhead, DAMON groups adjacent pages
|
||||
that assumed to have the same access frequencies into a region. As long as the
|
||||
assumption (pages in a region have the same access frequencies) is kept, only
|
||||
one page in the region is required to be checked. Thus, for each ``sampling
|
||||
interval``, DAMON randomly picks one page in each region, waits for one
|
||||
``sampling interval``, checks whether the page is accessed meanwhile, and
|
||||
increases the access frequency of the region if so. Therefore, the monitoring
|
||||
overhead is controllable by setting the number of regions. DAMON allows users
|
||||
to set the minimum and the maximum number of regions for the trade-off.
|
||||
|
||||
This scheme, however, cannot preserve the quality of the output if the
|
||||
assumption is not guaranteed.
|
||||
|
||||
|
||||
Adaptive Regions Adjustment
|
||||
---------------------------
|
||||
|
||||
Even somehow the initial monitoring target regions are well constructed to
|
||||
fulfill the assumption (pages in same region have similar access frequencies),
|
||||
the data access pattern can be dynamically changed. This will result in low
|
||||
monitoring quality. To keep the assumption as much as possible, DAMON
|
||||
adaptively merges and splits each region based on their access frequency.
|
||||
|
||||
For each ``aggregation interval``, it compares the access frequencies of
|
||||
adjacent regions and merges those if the frequency difference is small. Then,
|
||||
after it reports and clears the aggregated access frequency of each region, it
|
||||
splits each region into two or three regions if the total number of regions
|
||||
will not exceed the user-specified maximum number of regions after the split.
|
||||
|
||||
In this way, DAMON provides its best-effort quality and minimal overhead while
|
||||
keeping the bounds users set for their trade-off.
|
||||
|
||||
|
||||
Dynamic Target Space Updates Handling
|
||||
-------------------------------------
|
||||
|
||||
The monitoring target address range could dynamically changed. For example,
|
||||
virtual memory could be dynamically mapped and unmapped. Physical memory could
|
||||
be hot-plugged.
|
||||
|
||||
As the changes could be quite frequent in some cases, DAMON checks the dynamic
|
||||
memory mapping changes and applies it to the abstracted target area only for
|
||||
each of a user-specified time interval (``regions update interval``).
|
||||
51
Documentation/vm/damon/faq.rst
Normal file
51
Documentation/vm/damon/faq.rst
Normal file
@@ -0,0 +1,51 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
==========================
|
||||
Frequently Asked Questions
|
||||
==========================
|
||||
|
||||
Why a new subsystem, instead of extending perf or other user space tools?
|
||||
=========================================================================
|
||||
|
||||
First, because it needs to be lightweight as much as possible so that it can be
|
||||
used online, any unnecessary overhead such as kernel - user space context
|
||||
switching cost should be avoided. Second, DAMON aims to be used by other
|
||||
programs including the kernel. Therefore, having a dependency on specific
|
||||
tools like perf is not desirable. These are the two biggest reasons why DAMON
|
||||
is implemented in the kernel space.
|
||||
|
||||
|
||||
Can 'idle pages tracking' or 'perf mem' substitute DAMON?
|
||||
=========================================================
|
||||
|
||||
Idle page tracking is a low level primitive for access check of the physical
|
||||
address space. 'perf mem' is similar, though it can use sampling to minimize
|
||||
the overhead. On the other hand, DAMON is a higher-level framework for the
|
||||
monitoring of various address spaces. It is focused on memory management
|
||||
optimization and provides sophisticated accuracy/overhead handling mechanisms.
|
||||
Therefore, 'idle pages tracking' and 'perf mem' could provide a subset of
|
||||
DAMON's output, but cannot substitute DAMON.
|
||||
|
||||
|
||||
Does DAMON support virtual memory only?
|
||||
=======================================
|
||||
|
||||
No. The core of the DAMON is address space independent. The address space
|
||||
specific low level primitive parts including monitoring target regions
|
||||
constructions and actual access checks can be implemented and configured on the
|
||||
DAMON core by the users. In this way, DAMON users can monitor any address
|
||||
space with any access check technique.
|
||||
|
||||
Nonetheless, DAMON provides vma tracking and PTE Accessed bit check based
|
||||
implementations of the address space dependent functions for the virtual memory
|
||||
by default, for a reference and convenient use. In near future, we will
|
||||
provide those for physical memory address space.
|
||||
|
||||
|
||||
Can I simply monitor page granularity?
|
||||
======================================
|
||||
|
||||
Yes. You can do so by setting the ``min_nr_regions`` attribute higher than the
|
||||
working set size divided by the page size. Because the monitoring target
|
||||
regions size is forced to be ``>=page size``, the region split will make no
|
||||
effect.
|
||||
30
Documentation/vm/damon/index.rst
Normal file
30
Documentation/vm/damon/index.rst
Normal file
@@ -0,0 +1,30 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
==========================
|
||||
DAMON: Data Access MONitor
|
||||
==========================
|
||||
|
||||
DAMON is a data access monitoring framework subsystem for the Linux kernel.
|
||||
The core mechanisms of DAMON (refer to :doc:`design` for the detail) make it
|
||||
|
||||
- *accurate* (the monitoring output is useful enough for DRAM level memory
|
||||
management; It might not appropriate for CPU Cache levels, though),
|
||||
- *light-weight* (the monitoring overhead is low enough to be applied online),
|
||||
and
|
||||
- *scalable* (the upper-bound of the overhead is in constant range regardless
|
||||
of the size of target workloads).
|
||||
|
||||
Using this framework, therefore, the kernel's memory management mechanisms can
|
||||
make advanced decisions. Experimental memory management optimization works
|
||||
that incurring high data accesses monitoring overhead could implemented again.
|
||||
In user space, meanwhile, users who have some special workloads can write
|
||||
personalized applications for better understanding and optimizations of their
|
||||
workloads and systems.
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
faq
|
||||
design
|
||||
api
|
||||
plans
|
||||
@@ -32,6 +32,7 @@ descriptions of data structures and algorithms.
|
||||
arch_pgtable_helpers
|
||||
balance
|
||||
cleancache
|
||||
damon/index
|
||||
free_page_reporting
|
||||
frontswap
|
||||
highmem
|
||||
|
||||
15
MAINTAINERS
15
MAINTAINERS
@@ -4526,7 +4526,7 @@ F: .clang-format
|
||||
CLANG/LLVM BUILD SUPPORT
|
||||
M: Nathan Chancellor <nathan@kernel.org>
|
||||
M: Nick Desaulniers <ndesaulniers@google.com>
|
||||
L: clang-built-linux@googlegroups.com
|
||||
L: llvm@lists.linux.dev
|
||||
S: Supported
|
||||
W: https://clangbuiltlinux.github.io/
|
||||
B: https://github.com/ClangBuiltLinux/linux/issues
|
||||
@@ -4542,7 +4542,7 @@ M: Sami Tolvanen <samitolvanen@google.com>
|
||||
M: Kees Cook <keescook@chromium.org>
|
||||
R: Nathan Chancellor <nathan@kernel.org>
|
||||
R: Nick Desaulniers <ndesaulniers@google.com>
|
||||
L: clang-built-linux@googlegroups.com
|
||||
L: llvm@lists.linux.dev
|
||||
S: Supported
|
||||
B: https://github.com/ClangBuiltLinux/linux/issues
|
||||
T: git git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git for-next/clang/features
|
||||
@@ -5149,6 +5149,17 @@ F: net/ax25/ax25_out.c
|
||||
F: net/ax25/ax25_timer.c
|
||||
F: net/ax25/sysctl_net_ax25.c
|
||||
|
||||
DATA ACCESS MONITOR
|
||||
M: SeongJae Park <sjpark@amazon.de>
|
||||
L: linux-mm@kvack.org
|
||||
S: Maintained
|
||||
F: Documentation/admin-guide/mm/damon/
|
||||
F: Documentation/vm/damon/
|
||||
F: include/linux/damon.h
|
||||
F: include/trace/events/damon.h
|
||||
F: mm/damon/
|
||||
F: tools/testing/selftests/damon/
|
||||
|
||||
DAVICOM FAST ETHERNET (DMFE) NETWORK DRIVER
|
||||
L: netdev@vger.kernel.org
|
||||
S: Orphan
|
||||
|
||||
@@ -889,7 +889,7 @@ config HAVE_SOFTIRQ_ON_OWN_STACK
|
||||
bool
|
||||
help
|
||||
Architecture provides a function to run __do_softirq() on a
|
||||
seperate stack.
|
||||
separate stack.
|
||||
|
||||
config PGTABLE_LEVELS
|
||||
int
|
||||
|
||||
@@ -6,8 +6,8 @@
|
||||
|
||||
/* dummy for now */
|
||||
|
||||
#define map_page_into_agp(page)
|
||||
#define unmap_page_from_agp(page)
|
||||
#define map_page_into_agp(page) do { } while (0)
|
||||
#define unmap_page_from_agp(page) do { } while (0)
|
||||
#define flush_agp_cache() mb()
|
||||
|
||||
/* GATT allocation. Returns/accepts GATT kernel virtual address. */
|
||||
|
||||
@@ -60,6 +60,8 @@ static int __pci_mmap_fits(struct pci_dev *pdev, int num,
|
||||
* @sparse: address space type
|
||||
*
|
||||
* Use the bus mapping routines to map a PCI resource into userspace.
|
||||
*
|
||||
* Return: %0 on success, negative error code otherwise
|
||||
*/
|
||||
static int pci_mmap_resource(struct kobject *kobj,
|
||||
struct bin_attribute *attr,
|
||||
@@ -106,7 +108,7 @@ static int pci_mmap_resource_dense(struct file *filp, struct kobject *kobj,
|
||||
|
||||
/**
|
||||
* pci_remove_resource_files - cleanup resource files
|
||||
* @dev: dev to cleanup
|
||||
* @pdev: pci_dev to cleanup
|
||||
*
|
||||
* If we created resource files for @dev, remove them from sysfs and
|
||||
* free their resources.
|
||||
@@ -221,10 +223,12 @@ static int pci_create_attr(struct pci_dev *pdev, int num)
|
||||
}
|
||||
|
||||
/**
|
||||
* pci_create_resource_files - create resource files in sysfs for @dev
|
||||
* @dev: dev in question
|
||||
* pci_create_resource_files - create resource files in sysfs for @pdev
|
||||
* @pdev: pci_dev in question
|
||||
*
|
||||
* Walk the resources in @dev creating files for each resource available.
|
||||
*
|
||||
* Return: %0 on success, or negative error code
|
||||
*/
|
||||
int pci_create_resource_files(struct pci_dev *pdev)
|
||||
{
|
||||
@@ -296,7 +300,7 @@ int pci_mmap_legacy_page_range(struct pci_bus *bus, struct vm_area_struct *vma,
|
||||
|
||||
/**
|
||||
* pci_adjust_legacy_attr - adjustment of legacy file attributes
|
||||
* @b: bus to create files under
|
||||
* @bus: bus to create files under
|
||||
* @mmap_type: I/O port or memory
|
||||
*
|
||||
* Adjust file name and size for sparse mappings.
|
||||
|
||||
@@ -20,11 +20,6 @@
|
||||
#include <asm/unaligned.h>
|
||||
#include <asm/kprobes.h>
|
||||
|
||||
void __init trap_init(void)
|
||||
{
|
||||
return;
|
||||
}
|
||||
|
||||
void die(const char *str, struct pt_regs *regs, unsigned long address)
|
||||
{
|
||||
show_kernel_fault_diag(str, regs, address);
|
||||
|
||||
@@ -56,7 +56,6 @@ CONFIG_ATA=y
|
||||
CONFIG_SATA_MV=y
|
||||
CONFIG_NETDEVICES=y
|
||||
CONFIG_MV643XX_ETH=y
|
||||
CONFIG_INPUT_POLLDEV=y
|
||||
# CONFIG_INPUT_MOUSEDEV is not set
|
||||
CONFIG_INPUT_EVDEV=y
|
||||
# CONFIG_KEYBOARD_ATKBD is not set
|
||||
|
||||
@@ -284,7 +284,6 @@ CONFIG_RT2800USB=m
|
||||
CONFIG_MWIFIEX=m
|
||||
CONFIG_MWIFIEX_SDIO=m
|
||||
CONFIG_INPUT_FF_MEMLESS=m
|
||||
CONFIG_INPUT_POLLDEV=y
|
||||
CONFIG_INPUT_MATRIXKMAP=y
|
||||
CONFIG_INPUT_MOUSEDEV=m
|
||||
CONFIG_INPUT_MOUSEDEV_SCREEN_X=640
|
||||
|
||||
@@ -781,11 +781,6 @@ void abort(void)
|
||||
panic("Oops failed to kill thread");
|
||||
}
|
||||
|
||||
void __init trap_init(void)
|
||||
{
|
||||
return;
|
||||
}
|
||||
|
||||
#ifdef CONFIG_KUSER_HELPERS
|
||||
static void __init kuser_init(void *vectors)
|
||||
{
|
||||
|
||||
Some files were not shown because too many files have changed in this diff Show More
Reference in New Issue
Block a user