You've already forked linux-rockchip
mirror of
https://github.com/armbian/linux-rockchip.git
synced 2026-01-06 11:08:10 -08:00
Merge tag 'ASB-2020-02-05_4.19' of https://android.googlesource.com/kernel/common
https://source.android.com/security/bulletin/2020-02-01
CVE-2020-0030
CVE-2019-11599
* tag 'ASB-2020-02-05_4.19': (4206 commits)
UPSTREAM: sched/fair/util_est: Implement faster ramp-up EWMA on utilization increases
ANDROID: Re-use SUGOV_RT_MAX_FREQ to control uclamp rt behavior
BACKPORT: sched/fair: Make EAS wakeup placement consider uclamp restrictions
BACKPORT: sched/fair: Make task_fits_capacity() consider uclamp restrictions
ANDROID: sched/core: Move SchedTune task API into UtilClamp wrappers
ANDROID: sched/core: Add a latency-sensitive flag to uclamp
ANDROID: sched/tune: Move SchedTune cpu API into UtilClamp wrappers
ANDROID: init: kconfig: Only allow sched tune if !uclamp
FROMGIT: sched/core: Fix size of rq::uclamp initialization
FROMGIT: sched/uclamp: Fix a bug in propagating uclamp value in new cgroups
FROMGIT: sched/uclamp: Rename uclamp_util_with() into uclamp_rq_util_with()
FROMGIT: sched/uclamp: Make uclamp util helpers use and return UL values
FROMGIT: sched/uclamp: Remove uclamp_util()
BACKPORT: sched/rt: Make RT capacity-aware
UPSTREAM: tools headers UAPI: Sync sched.h with the kernel
UPSTREAM: sched/uclamp: Fix overzealous type replacement
UPSTREAM: sched/uclamp: Fix incorrect condition
UPSTREAM: sched/core: Fix compilation error when cgroup not selected
UPSTREAM: sched/core: Fix uclamp ABI bug, clean up and robustify sched_read_attr() ABI logic and code
UPSTREAM: sched/uclamp: Always use 'enum uclamp_id' for clamp_id values
...
Conflicts:
drivers/char/random.c
drivers/devfreq/devfreq.c
drivers/gpu/drm/drm_fb_helper.c
drivers/media/i2c/ov2680.c
drivers/media/i2c/ov2685.c
drivers/media/i2c/ov5670.c
drivers/media/i2c/ov5695.c
drivers/media/usb/uvc/uvc_driver.c
drivers/mmc/host/cqhci.c
drivers/spi/spi-rockchip.c
drivers/usb/dwc2/params.c
drivers/usb/dwc3/debugfs.c
drivers/usb/dwc3/gadget.c
drivers/usb/serial/usb_wwan.c
include/linux/clk-provider.h
include/linux/mfd/rk808.h
kernel/cpu.c
sound/usb/quirks.c
- Export symbol mm_trace_rss_stat on mm/memory.c for GPU drivers.
- Fix sound/usb/pcm.c for SNDRV_PCM_TRIGGER_SUSPEND.
- Enable DEBUG_FS which is not selected by TRACING.
- Disable of_devlink which broken boot. of_devlink is enabled by commit
ba3aa33b8f ("ANDROID: of: property: Enable of_devlink by default").
- Add CLK_DONT_HOLD_STATE and CLK_KEEP_REQ_RATE to clk_flags
on drivers/clk/clk.c.
Change-Id: I500ca1bbc735753f9c8251ed2ac8ad757d5a24a4
Signed-off-by: Tao Huang <huangtao@rock-chips.com>
This commit is contained in:
@@ -199,7 +199,7 @@ Description:
|
||||
|
||||
What: /sys/bus/iio/devices/iio:deviceX/in_positionrelative_x_raw
|
||||
What: /sys/bus/iio/devices/iio:deviceX/in_positionrelative_y_raw
|
||||
KernelVersion: 4.18
|
||||
KernelVersion: 4.19
|
||||
Contact: linux-iio@vger.kernel.org
|
||||
Description:
|
||||
Relative position in direction x or y on a pad (may be
|
||||
|
||||
@@ -4,7 +4,7 @@ KernelVersion: 3.10
|
||||
Contact: Samuel Ortiz <sameo@linux.intel.com>
|
||||
linux-mei@linux.intel.com
|
||||
Description: Stores the same MODALIAS value emitted by uevent
|
||||
Format: mei:<mei device name>:<device uuid>:
|
||||
Format: mei:<mei device name>:<device uuid>:<protocol version>
|
||||
|
||||
What: /sys/bus/mei/devices/.../name
|
||||
Date: May 2015
|
||||
|
||||
@@ -478,6 +478,8 @@ What: /sys/devices/system/cpu/vulnerabilities
|
||||
/sys/devices/system/cpu/vulnerabilities/spec_store_bypass
|
||||
/sys/devices/system/cpu/vulnerabilities/l1tf
|
||||
/sys/devices/system/cpu/vulnerabilities/mds
|
||||
/sys/devices/system/cpu/vulnerabilities/tsx_async_abort
|
||||
/sys/devices/system/cpu/vulnerabilities/itlb_multihit
|
||||
Date: January 2018
|
||||
Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org>
|
||||
Description: Information about CPU vulnerabilities
|
||||
|
||||
@@ -31,6 +31,12 @@ Contact: "Jaegeuk Kim" <jaegeuk.kim@samsung.com>
|
||||
Description:
|
||||
Controls the issue rate of segment discard commands.
|
||||
|
||||
What: /sys/fs/f2fs/<disk>/max_blkaddr
|
||||
Date: November 2019
|
||||
Contact: "Ramon Pantin" <pantin@google.com>
|
||||
Description:
|
||||
Shows first block address of MAIN area.
|
||||
|
||||
What: /sys/fs/f2fs/<disk>/ipu_policy
|
||||
Date: November 2013
|
||||
Contact: "Jaegeuk Kim" <jaegeuk.kim@samsung.com>
|
||||
|
||||
27
Documentation/ABI/testing/sysfs-kernel-ion
Normal file
27
Documentation/ABI/testing/sysfs-kernel-ion
Normal file
@@ -0,0 +1,27 @@
|
||||
What: /sys/kernel/ion
|
||||
Date: Dec 2019
|
||||
KernelVersion: 4.14.158
|
||||
Contact: Suren Baghdasaryan <surenb@google.com>,
|
||||
Sandeep Patil <sspatil@google.com>
|
||||
Description:
|
||||
The /sys/kernel/ion directory contains a snapshot of the
|
||||
internal state of ION memory heaps and pools.
|
||||
Users: kernel memory tuning tools
|
||||
|
||||
What: /sys/kernel/ion/total_heaps_kb
|
||||
Date: Dec 2019
|
||||
KernelVersion: 4.14.158
|
||||
Contact: Suren Baghdasaryan <surenb@google.com>,
|
||||
Sandeep Patil <sspatil@google.com>
|
||||
Description:
|
||||
The total_heaps_kb file is read-only and specifies how much
|
||||
memory in Kb is allocated to ION heaps.
|
||||
|
||||
What: /sys/kernel/ion/total_pools_kb
|
||||
Date: Dec 2019
|
||||
KernelVersion: 4.14.158
|
||||
Contact: Suren Baghdasaryan <surenb@google.com>,
|
||||
Sandeep Patil <sspatil@google.com>
|
||||
Description:
|
||||
The total_pools_kb file is read-only and specifies how much
|
||||
memory in Kb is allocated to ION pools.
|
||||
@@ -694,6 +694,12 @@ Conventions
|
||||
informational files on the root cgroup which end up showing global
|
||||
information available elsewhere shouldn't exist.
|
||||
|
||||
- The default time unit is microseconds. If a different unit is ever
|
||||
used, an explicit unit suffix must be present.
|
||||
|
||||
- A parts-per quantity should use a percentage decimal with at least
|
||||
two digit fractional part - e.g. 13.40.
|
||||
|
||||
- If a controller implements weight based resource distribution, its
|
||||
interface file should be named "weight" and have the range [1,
|
||||
10000] with 100 as the default. The values are chosen to allow
|
||||
@@ -907,6 +913,13 @@ controller implements weight and absolute bandwidth limit models for
|
||||
normal scheduling policy and absolute bandwidth allocation model for
|
||||
realtime scheduling policy.
|
||||
|
||||
In all the above models, cycles distribution is defined only on a temporal
|
||||
base and it does not account for the frequency at which tasks are executed.
|
||||
The (optional) utilization clamping support allows to hint the schedutil
|
||||
cpufreq governor about the minimum desired frequency which should always be
|
||||
provided by a CPU, as well as the maximum desired frequency, which should not
|
||||
be exceeded by a CPU.
|
||||
|
||||
WARNING: cgroup2 doesn't yet support control of realtime processes and
|
||||
the cpu controller can only be enabled when all RT processes are in
|
||||
the root cgroup. Be aware that system management software may already
|
||||
@@ -972,6 +985,33 @@ All time durations are in microseconds.
|
||||
Shows pressure stall information for CPU. See
|
||||
Documentation/accounting/psi.txt for details.
|
||||
|
||||
cpu.uclamp.min
|
||||
A read-write single value file which exists on non-root cgroups.
|
||||
The default is "0", i.e. no utilization boosting.
|
||||
|
||||
The requested minimum utilization (protection) as a percentage
|
||||
rational number, e.g. 12.34 for 12.34%.
|
||||
|
||||
This interface allows reading and setting minimum utilization clamp
|
||||
values similar to the sched_setattr(2). This minimum utilization
|
||||
value is used to clamp the task specific minimum utilization clamp.
|
||||
|
||||
The requested minimum utilization (protection) is always capped by
|
||||
the current value for the maximum utilization (limit), i.e.
|
||||
`cpu.uclamp.max`.
|
||||
|
||||
cpu.uclamp.max
|
||||
A read-write single value file which exists on non-root cgroups.
|
||||
The default is "max". i.e. no utilization capping
|
||||
|
||||
The requested maximum utilization (limit) as a percentage rational
|
||||
number, e.g. 98.76 for 98.76%.
|
||||
|
||||
This interface allows reading and setting maximum utilization clamp
|
||||
values similar to the sched_setattr(2). This maximum utilization
|
||||
value is used to clamp the task specific maximum utilization clamp.
|
||||
|
||||
|
||||
|
||||
Memory
|
||||
------
|
||||
|
||||
@@ -12,3 +12,5 @@ are configurable at compile, boot or run time.
|
||||
spectre
|
||||
l1tf
|
||||
mds
|
||||
tsx_async_abort
|
||||
multihit.rst
|
||||
|
||||
@@ -265,8 +265,11 @@ time with the option "mds=". The valid arguments for this option are:
|
||||
|
||||
============ =============================================================
|
||||
|
||||
Not specifying this option is equivalent to "mds=full".
|
||||
|
||||
Not specifying this option is equivalent to "mds=full". For processors
|
||||
that are affected by both TAA (TSX Asynchronous Abort) and MDS,
|
||||
specifying just "mds=off" without an accompanying "tsx_async_abort=off"
|
||||
will have no effect as the same mitigation is used for both
|
||||
vulnerabilities.
|
||||
|
||||
Mitigation selection guide
|
||||
--------------------------
|
||||
|
||||
163
Documentation/admin-guide/hw-vuln/multihit.rst
Normal file
163
Documentation/admin-guide/hw-vuln/multihit.rst
Normal file
@@ -0,0 +1,163 @@
|
||||
iTLB multihit
|
||||
=============
|
||||
|
||||
iTLB multihit is an erratum where some processors may incur a machine check
|
||||
error, possibly resulting in an unrecoverable CPU lockup, when an
|
||||
instruction fetch hits multiple entries in the instruction TLB. This can
|
||||
occur when the page size is changed along with either the physical address
|
||||
or cache type. A malicious guest running on a virtualized system can
|
||||
exploit this erratum to perform a denial of service attack.
|
||||
|
||||
|
||||
Affected processors
|
||||
-------------------
|
||||
|
||||
Variations of this erratum are present on most Intel Core and Xeon processor
|
||||
models. The erratum is not present on:
|
||||
|
||||
- non-Intel processors
|
||||
|
||||
- Some Atoms (Airmont, Bonnell, Goldmont, GoldmontPlus, Saltwell, Silvermont)
|
||||
|
||||
- Intel processors that have the PSCHANGE_MC_NO bit set in the
|
||||
IA32_ARCH_CAPABILITIES MSR.
|
||||
|
||||
|
||||
Related CVEs
|
||||
------------
|
||||
|
||||
The following CVE entry is related to this issue:
|
||||
|
||||
============== =================================================
|
||||
CVE-2018-12207 Machine Check Error Avoidance on Page Size Change
|
||||
============== =================================================
|
||||
|
||||
|
||||
Problem
|
||||
-------
|
||||
|
||||
Privileged software, including OS and virtual machine managers (VMM), are in
|
||||
charge of memory management. A key component in memory management is the control
|
||||
of the page tables. Modern processors use virtual memory, a technique that creates
|
||||
the illusion of a very large memory for processors. This virtual space is split
|
||||
into pages of a given size. Page tables translate virtual addresses to physical
|
||||
addresses.
|
||||
|
||||
To reduce latency when performing a virtual to physical address translation,
|
||||
processors include a structure, called TLB, that caches recent translations.
|
||||
There are separate TLBs for instruction (iTLB) and data (dTLB).
|
||||
|
||||
Under this errata, instructions are fetched from a linear address translated
|
||||
using a 4 KB translation cached in the iTLB. Privileged software modifies the
|
||||
paging structure so that the same linear address using large page size (2 MB, 4
|
||||
MB, 1 GB) with a different physical address or memory type. After the page
|
||||
structure modification but before the software invalidates any iTLB entries for
|
||||
the linear address, a code fetch that happens on the same linear address may
|
||||
cause a machine-check error which can result in a system hang or shutdown.
|
||||
|
||||
|
||||
Attack scenarios
|
||||
----------------
|
||||
|
||||
Attacks against the iTLB multihit erratum can be mounted from malicious
|
||||
guests in a virtualized system.
|
||||
|
||||
|
||||
iTLB multihit system information
|
||||
--------------------------------
|
||||
|
||||
The Linux kernel provides a sysfs interface to enumerate the current iTLB
|
||||
multihit status of the system:whether the system is vulnerable and which
|
||||
mitigations are active. The relevant sysfs file is:
|
||||
|
||||
/sys/devices/system/cpu/vulnerabilities/itlb_multihit
|
||||
|
||||
The possible values in this file are:
|
||||
|
||||
.. list-table::
|
||||
|
||||
* - Not affected
|
||||
- The processor is not vulnerable.
|
||||
* - KVM: Mitigation: Split huge pages
|
||||
- Software changes mitigate this issue.
|
||||
* - KVM: Vulnerable
|
||||
- The processor is vulnerable, but no mitigation enabled
|
||||
|
||||
|
||||
Enumeration of the erratum
|
||||
--------------------------------
|
||||
|
||||
A new bit has been allocated in the IA32_ARCH_CAPABILITIES (PSCHANGE_MC_NO) msr
|
||||
and will be set on CPU's which are mitigated against this issue.
|
||||
|
||||
======================================= =========== ===============================
|
||||
IA32_ARCH_CAPABILITIES MSR Not present Possibly vulnerable,check model
|
||||
IA32_ARCH_CAPABILITIES[PSCHANGE_MC_NO] '0' Likely vulnerable,check model
|
||||
IA32_ARCH_CAPABILITIES[PSCHANGE_MC_NO] '1' Not vulnerable
|
||||
======================================= =========== ===============================
|
||||
|
||||
|
||||
Mitigation mechanism
|
||||
-------------------------
|
||||
|
||||
This erratum can be mitigated by restricting the use of large page sizes to
|
||||
non-executable pages. This forces all iTLB entries to be 4K, and removes
|
||||
the possibility of multiple hits.
|
||||
|
||||
In order to mitigate the vulnerability, KVM initially marks all huge pages
|
||||
as non-executable. If the guest attempts to execute in one of those pages,
|
||||
the page is broken down into 4K pages, which are then marked executable.
|
||||
|
||||
If EPT is disabled or not available on the host, KVM is in control of TLB
|
||||
flushes and the problematic situation cannot happen. However, the shadow
|
||||
EPT paging mechanism used by nested virtualization is vulnerable, because
|
||||
the nested guest can trigger multiple iTLB hits by modifying its own
|
||||
(non-nested) page tables. For simplicity, KVM will make large pages
|
||||
non-executable in all shadow paging modes.
|
||||
|
||||
Mitigation control on the kernel command line and KVM - module parameter
|
||||
------------------------------------------------------------------------
|
||||
|
||||
The KVM hypervisor mitigation mechanism for marking huge pages as
|
||||
non-executable can be controlled with a module parameter "nx_huge_pages=".
|
||||
The kernel command line allows to control the iTLB multihit mitigations at
|
||||
boot time with the option "kvm.nx_huge_pages=".
|
||||
|
||||
The valid arguments for these options are:
|
||||
|
||||
========== ================================================================
|
||||
force Mitigation is enabled. In this case, the mitigation implements
|
||||
non-executable huge pages in Linux kernel KVM module. All huge
|
||||
pages in the EPT are marked as non-executable.
|
||||
If a guest attempts to execute in one of those pages, the page is
|
||||
broken down into 4K pages, which are then marked executable.
|
||||
|
||||
off Mitigation is disabled.
|
||||
|
||||
auto Enable mitigation only if the platform is affected and the kernel
|
||||
was not booted with the "mitigations=off" command line parameter.
|
||||
This is the default option.
|
||||
========== ================================================================
|
||||
|
||||
|
||||
Mitigation selection guide
|
||||
--------------------------
|
||||
|
||||
1. No virtualization in use
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
The system is protected by the kernel unconditionally and no further
|
||||
action is required.
|
||||
|
||||
2. Virtualization with trusted guests
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
If the guest comes from a trusted source, you may assume that the guest will
|
||||
not attempt to maliciously exploit these errata and no further action is
|
||||
required.
|
||||
|
||||
3. Virtualization with untrusted guests
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
If the guest comes from an untrusted source, the guest host kernel will need
|
||||
to apply iTLB multihit mitigation via the kernel command line or kvm
|
||||
module parameter.
|
||||
279
Documentation/admin-guide/hw-vuln/tsx_async_abort.rst
Normal file
279
Documentation/admin-guide/hw-vuln/tsx_async_abort.rst
Normal file
@@ -0,0 +1,279 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
TAA - TSX Asynchronous Abort
|
||||
======================================
|
||||
|
||||
TAA is a hardware vulnerability that allows unprivileged speculative access to
|
||||
data which is available in various CPU internal buffers by using asynchronous
|
||||
aborts within an Intel TSX transactional region.
|
||||
|
||||
Affected processors
|
||||
-------------------
|
||||
|
||||
This vulnerability only affects Intel processors that support Intel
|
||||
Transactional Synchronization Extensions (TSX) when the TAA_NO bit (bit 8)
|
||||
is 0 in the IA32_ARCH_CAPABILITIES MSR. On processors where the MDS_NO bit
|
||||
(bit 5) is 0 in the IA32_ARCH_CAPABILITIES MSR, the existing MDS mitigations
|
||||
also mitigate against TAA.
|
||||
|
||||
Whether a processor is affected or not can be read out from the TAA
|
||||
vulnerability file in sysfs. See :ref:`tsx_async_abort_sys_info`.
|
||||
|
||||
Related CVEs
|
||||
------------
|
||||
|
||||
The following CVE entry is related to this TAA issue:
|
||||
|
||||
============== ===== ===================================================
|
||||
CVE-2019-11135 TAA TSX Asynchronous Abort (TAA) condition on some
|
||||
microprocessors utilizing speculative execution may
|
||||
allow an authenticated user to potentially enable
|
||||
information disclosure via a side channel with
|
||||
local access.
|
||||
============== ===== ===================================================
|
||||
|
||||
Problem
|
||||
-------
|
||||
|
||||
When performing store, load or L1 refill operations, processors write
|
||||
data into temporary microarchitectural structures (buffers). The data in
|
||||
those buffers can be forwarded to load operations as an optimization.
|
||||
|
||||
Intel TSX is an extension to the x86 instruction set architecture that adds
|
||||
hardware transactional memory support to improve performance of multi-threaded
|
||||
software. TSX lets the processor expose and exploit concurrency hidden in an
|
||||
application due to dynamically avoiding unnecessary synchronization.
|
||||
|
||||
TSX supports atomic memory transactions that are either committed (success) or
|
||||
aborted. During an abort, operations that happened within the transactional region
|
||||
are rolled back. An asynchronous abort takes place, among other options, when a
|
||||
different thread accesses a cache line that is also used within the transactional
|
||||
region when that access might lead to a data race.
|
||||
|
||||
Immediately after an uncompleted asynchronous abort, certain speculatively
|
||||
executed loads may read data from those internal buffers and pass it to dependent
|
||||
operations. This can be then used to infer the value via a cache side channel
|
||||
attack.
|
||||
|
||||
Because the buffers are potentially shared between Hyper-Threads cross
|
||||
Hyper-Thread attacks are possible.
|
||||
|
||||
The victim of a malicious actor does not need to make use of TSX. Only the
|
||||
attacker needs to begin a TSX transaction and raise an asynchronous abort
|
||||
which in turn potenitally leaks data stored in the buffers.
|
||||
|
||||
More detailed technical information is available in the TAA specific x86
|
||||
architecture section: :ref:`Documentation/x86/tsx_async_abort.rst <tsx_async_abort>`.
|
||||
|
||||
|
||||
Attack scenarios
|
||||
----------------
|
||||
|
||||
Attacks against the TAA vulnerability can be implemented from unprivileged
|
||||
applications running on hosts or guests.
|
||||
|
||||
As for MDS, the attacker has no control over the memory addresses that can
|
||||
be leaked. Only the victim is responsible for bringing data to the CPU. As
|
||||
a result, the malicious actor has to sample as much data as possible and
|
||||
then postprocess it to try to infer any useful information from it.
|
||||
|
||||
A potential attacker only has read access to the data. Also, there is no direct
|
||||
privilege escalation by using this technique.
|
||||
|
||||
|
||||
.. _tsx_async_abort_sys_info:
|
||||
|
||||
TAA system information
|
||||
-----------------------
|
||||
|
||||
The Linux kernel provides a sysfs interface to enumerate the current TAA status
|
||||
of mitigated systems. The relevant sysfs file is:
|
||||
|
||||
/sys/devices/system/cpu/vulnerabilities/tsx_async_abort
|
||||
|
||||
The possible values in this file are:
|
||||
|
||||
.. list-table::
|
||||
|
||||
* - 'Vulnerable'
|
||||
- The CPU is affected by this vulnerability and the microcode and kernel mitigation are not applied.
|
||||
* - 'Vulnerable: Clear CPU buffers attempted, no microcode'
|
||||
- The system tries to clear the buffers but the microcode might not support the operation.
|
||||
* - 'Mitigation: Clear CPU buffers'
|
||||
- The microcode has been updated to clear the buffers. TSX is still enabled.
|
||||
* - 'Mitigation: TSX disabled'
|
||||
- TSX is disabled.
|
||||
* - 'Not affected'
|
||||
- The CPU is not affected by this issue.
|
||||
|
||||
.. _ucode_needed:
|
||||
|
||||
Best effort mitigation mode
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
If the processor is vulnerable, but the availability of the microcode-based
|
||||
mitigation mechanism is not advertised via CPUID the kernel selects a best
|
||||
effort mitigation mode. This mode invokes the mitigation instructions
|
||||
without a guarantee that they clear the CPU buffers.
|
||||
|
||||
This is done to address virtualization scenarios where the host has the
|
||||
microcode update applied, but the hypervisor is not yet updated to expose the
|
||||
CPUID to the guest. If the host has updated microcode the protection takes
|
||||
effect; otherwise a few CPU cycles are wasted pointlessly.
|
||||
|
||||
The state in the tsx_async_abort sysfs file reflects this situation
|
||||
accordingly.
|
||||
|
||||
|
||||
Mitigation mechanism
|
||||
--------------------
|
||||
|
||||
The kernel detects the affected CPUs and the presence of the microcode which is
|
||||
required. If a CPU is affected and the microcode is available, then the kernel
|
||||
enables the mitigation by default.
|
||||
|
||||
|
||||
The mitigation can be controlled at boot time via a kernel command line option.
|
||||
See :ref:`taa_mitigation_control_command_line`.
|
||||
|
||||
.. _virt_mechanism:
|
||||
|
||||
Virtualization mitigation
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Affected systems where the host has TAA microcode and TAA is mitigated by
|
||||
having disabled TSX previously, are not vulnerable regardless of the status
|
||||
of the VMs.
|
||||
|
||||
In all other cases, if the host either does not have the TAA microcode or
|
||||
the kernel is not mitigated, the system might be vulnerable.
|
||||
|
||||
|
||||
.. _taa_mitigation_control_command_line:
|
||||
|
||||
Mitigation control on the kernel command line
|
||||
---------------------------------------------
|
||||
|
||||
The kernel command line allows to control the TAA mitigations at boot time with
|
||||
the option "tsx_async_abort=". The valid arguments for this option are:
|
||||
|
||||
============ =============================================================
|
||||
off This option disables the TAA mitigation on affected platforms.
|
||||
If the system has TSX enabled (see next parameter) and the CPU
|
||||
is affected, the system is vulnerable.
|
||||
|
||||
full TAA mitigation is enabled. If TSX is enabled, on an affected
|
||||
system it will clear CPU buffers on ring transitions. On
|
||||
systems which are MDS-affected and deploy MDS mitigation,
|
||||
TAA is also mitigated. Specifying this option on those
|
||||
systems will have no effect.
|
||||
|
||||
full,nosmt The same as tsx_async_abort=full, with SMT disabled on
|
||||
vulnerable CPUs that have TSX enabled. This is the complete
|
||||
mitigation. When TSX is disabled, SMT is not disabled because
|
||||
CPU is not vulnerable to cross-thread TAA attacks.
|
||||
============ =============================================================
|
||||
|
||||
Not specifying this option is equivalent to "tsx_async_abort=full". For
|
||||
processors that are affected by both TAA and MDS, specifying just
|
||||
"tsx_async_abort=off" without an accompanying "mds=off" will have no
|
||||
effect as the same mitigation is used for both vulnerabilities.
|
||||
|
||||
The kernel command line also allows to control the TSX feature using the
|
||||
parameter "tsx=" on CPUs which support TSX control. MSR_IA32_TSX_CTRL is used
|
||||
to control the TSX feature and the enumeration of the TSX feature bits (RTM
|
||||
and HLE) in CPUID.
|
||||
|
||||
The valid options are:
|
||||
|
||||
============ =============================================================
|
||||
off Disables TSX on the system.
|
||||
|
||||
Note that this option takes effect only on newer CPUs which are
|
||||
not vulnerable to MDS, i.e., have MSR_IA32_ARCH_CAPABILITIES.MDS_NO=1
|
||||
and which get the new IA32_TSX_CTRL MSR through a microcode
|
||||
update. This new MSR allows for the reliable deactivation of
|
||||
the TSX functionality.
|
||||
|
||||
on Enables TSX.
|
||||
|
||||
Although there are mitigations for all known security
|
||||
vulnerabilities, TSX has been known to be an accelerator for
|
||||
several previous speculation-related CVEs, and so there may be
|
||||
unknown security risks associated with leaving it enabled.
|
||||
|
||||
auto Disables TSX if X86_BUG_TAA is present, otherwise enables TSX
|
||||
on the system.
|
||||
============ =============================================================
|
||||
|
||||
Not specifying this option is equivalent to "tsx=off".
|
||||
|
||||
The following combinations of the "tsx_async_abort" and "tsx" are possible. For
|
||||
affected platforms tsx=auto is equivalent to tsx=off and the result will be:
|
||||
|
||||
========= ========================== =========================================
|
||||
tsx=on tsx_async_abort=full The system will use VERW to clear CPU
|
||||
buffers. Cross-thread attacks are still
|
||||
possible on SMT machines.
|
||||
tsx=on tsx_async_abort=full,nosmt As above, cross-thread attacks on SMT
|
||||
mitigated.
|
||||
tsx=on tsx_async_abort=off The system is vulnerable.
|
||||
tsx=off tsx_async_abort=full TSX might be disabled if microcode
|
||||
provides a TSX control MSR. If so,
|
||||
system is not vulnerable.
|
||||
tsx=off tsx_async_abort=full,nosmt Ditto
|
||||
tsx=off tsx_async_abort=off ditto
|
||||
========= ========================== =========================================
|
||||
|
||||
|
||||
For unaffected platforms "tsx=on" and "tsx_async_abort=full" does not clear CPU
|
||||
buffers. For platforms without TSX control (MSR_IA32_ARCH_CAPABILITIES.MDS_NO=0)
|
||||
"tsx" command line argument has no effect.
|
||||
|
||||
For the affected platforms below table indicates the mitigation status for the
|
||||
combinations of CPUID bit MD_CLEAR and IA32_ARCH_CAPABILITIES MSR bits MDS_NO
|
||||
and TSX_CTRL_MSR.
|
||||
|
||||
======= ========= ============= ========================================
|
||||
MDS_NO MD_CLEAR TSX_CTRL_MSR Status
|
||||
======= ========= ============= ========================================
|
||||
0 0 0 Vulnerable (needs microcode)
|
||||
0 1 0 MDS and TAA mitigated via VERW
|
||||
1 1 0 MDS fixed, TAA vulnerable if TSX enabled
|
||||
because MD_CLEAR has no meaning and
|
||||
VERW is not guaranteed to clear buffers
|
||||
1 X 1 MDS fixed, TAA can be mitigated by
|
||||
VERW or TSX_CTRL_MSR
|
||||
======= ========= ============= ========================================
|
||||
|
||||
Mitigation selection guide
|
||||
--------------------------
|
||||
|
||||
1. Trusted userspace and guests
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
If all user space applications are from a trusted source and do not execute
|
||||
untrusted code which is supplied externally, then the mitigation can be
|
||||
disabled. The same applies to virtualized environments with trusted guests.
|
||||
|
||||
|
||||
2. Untrusted userspace and guests
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
If there are untrusted applications or guests on the system, enabling TSX
|
||||
might allow a malicious actor to leak data from the host or from other
|
||||
processes running on the same physical core.
|
||||
|
||||
If the microcode is available and the TSX is disabled on the host, attacks
|
||||
are prevented in a virtualized environment as well, even if the VMs do not
|
||||
explicitly enable the mitigation.
|
||||
|
||||
|
||||
.. _taa_default_mitigations:
|
||||
|
||||
Default mitigations
|
||||
-------------------
|
||||
|
||||
The kernel's default action for vulnerable processors is:
|
||||
|
||||
- Deploy TSX disable mitigation (tsx_async_abort=full tsx=off).
|
||||
@@ -113,7 +113,7 @@
|
||||
the GPE dispatcher.
|
||||
This facility can be used to prevent such uncontrolled
|
||||
GPE floodings.
|
||||
Format: <int>
|
||||
Format: <byte>
|
||||
|
||||
acpi_no_auto_serialize [HW,ACPI]
|
||||
Disable auto-serialization of AML methods
|
||||
@@ -1955,6 +1955,12 @@
|
||||
Built with CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF=y,
|
||||
the default is off.
|
||||
|
||||
kpti= [ARM64] Control page table isolation of user
|
||||
and kernel address spaces.
|
||||
Default: enabled on cores which need mitigation.
|
||||
0: force disabled
|
||||
1: force enabled
|
||||
|
||||
kvm.ignore_msrs=[KVM] Ignore guest accesses to unhandled MSRs.
|
||||
Default is 0 (don't ignore, but inject #GP)
|
||||
|
||||
@@ -1965,6 +1971,25 @@
|
||||
KVM MMU at runtime.
|
||||
Default is 0 (off)
|
||||
|
||||
kvm.nx_huge_pages=
|
||||
[KVM] Controls the software workaround for the
|
||||
X86_BUG_ITLB_MULTIHIT bug.
|
||||
force : Always deploy workaround.
|
||||
off : Never deploy workaround.
|
||||
auto : Deploy workaround based on the presence of
|
||||
X86_BUG_ITLB_MULTIHIT.
|
||||
|
||||
Default is 'auto'.
|
||||
|
||||
If the software workaround is enabled for the host,
|
||||
guests do need not to enable it for nested guests.
|
||||
|
||||
kvm.nx_huge_pages_recovery_ratio=
|
||||
[KVM] Controls how many 4KiB pages are periodically zapped
|
||||
back to huge pages. 0 disables the recovery, otherwise if
|
||||
the value is N KVM will zap 1/Nth of the 4KiB pages every
|
||||
minute. The default is 60.
|
||||
|
||||
kvm-amd.nested= [KVM,AMD] Allow nested virtualization in KVM/SVM.
|
||||
Default is 1 (enabled)
|
||||
|
||||
@@ -2349,6 +2374,12 @@
|
||||
SMT on vulnerable CPUs
|
||||
off - Unconditionally disable MDS mitigation
|
||||
|
||||
On TAA-affected machines, mds=off can be prevented by
|
||||
an active TAA mitigation as both vulnerabilities are
|
||||
mitigated with the same mechanism so in order to disable
|
||||
this mitigation, you need to specify tsx_async_abort=off
|
||||
too.
|
||||
|
||||
Not specifying this option is equivalent to
|
||||
mds=full.
|
||||
|
||||
@@ -2532,6 +2563,13 @@
|
||||
ssbd=force-off [ARM64]
|
||||
l1tf=off [X86]
|
||||
mds=off [X86]
|
||||
tsx_async_abort=off [X86]
|
||||
kvm.nx_huge_pages=off [X86]
|
||||
|
||||
Exceptions:
|
||||
This does not have any effect on
|
||||
kvm.nx_huge_pages when
|
||||
kvm.nx_huge_pages=force.
|
||||
|
||||
auto (default)
|
||||
Mitigate all CPU vulnerabilities, but leave SMT
|
||||
@@ -2547,6 +2585,7 @@
|
||||
be fully mitigated, even if it means losing SMT.
|
||||
Equivalent to: l1tf=flush,nosmt [X86]
|
||||
mds=full,nosmt [X86]
|
||||
tsx_async_abort=full,nosmt [X86]
|
||||
|
||||
mminit_loglevel=
|
||||
[KNL] When CONFIG_DEBUG_MEMORY_INIT is set, this
|
||||
@@ -4709,6 +4748,76 @@
|
||||
marks the TSC unconditionally unstable at bootup and
|
||||
avoids any further wobbles once the TSC watchdog notices.
|
||||
|
||||
tsx= [X86] Control Transactional Synchronization
|
||||
Extensions (TSX) feature in Intel processors that
|
||||
support TSX control.
|
||||
|
||||
This parameter controls the TSX feature. The options are:
|
||||
|
||||
on - Enable TSX on the system. Although there are
|
||||
mitigations for all known security vulnerabilities,
|
||||
TSX has been known to be an accelerator for
|
||||
several previous speculation-related CVEs, and
|
||||
so there may be unknown security risks associated
|
||||
with leaving it enabled.
|
||||
|
||||
off - Disable TSX on the system. (Note that this
|
||||
option takes effect only on newer CPUs which are
|
||||
not vulnerable to MDS, i.e., have
|
||||
MSR_IA32_ARCH_CAPABILITIES.MDS_NO=1 and which get
|
||||
the new IA32_TSX_CTRL MSR through a microcode
|
||||
update. This new MSR allows for the reliable
|
||||
deactivation of the TSX functionality.)
|
||||
|
||||
auto - Disable TSX if X86_BUG_TAA is present,
|
||||
otherwise enable TSX on the system.
|
||||
|
||||
Not specifying this option is equivalent to tsx=off.
|
||||
|
||||
See Documentation/admin-guide/hw-vuln/tsx_async_abort.rst
|
||||
for more details.
|
||||
|
||||
tsx_async_abort= [X86,INTEL] Control mitigation for the TSX Async
|
||||
Abort (TAA) vulnerability.
|
||||
|
||||
Similar to Micro-architectural Data Sampling (MDS)
|
||||
certain CPUs that support Transactional
|
||||
Synchronization Extensions (TSX) are vulnerable to an
|
||||
exploit against CPU internal buffers which can forward
|
||||
information to a disclosure gadget under certain
|
||||
conditions.
|
||||
|
||||
In vulnerable processors, the speculatively forwarded
|
||||
data can be used in a cache side channel attack, to
|
||||
access data to which the attacker does not have direct
|
||||
access.
|
||||
|
||||
This parameter controls the TAA mitigation. The
|
||||
options are:
|
||||
|
||||
full - Enable TAA mitigation on vulnerable CPUs
|
||||
if TSX is enabled.
|
||||
|
||||
full,nosmt - Enable TAA mitigation and disable SMT on
|
||||
vulnerable CPUs. If TSX is disabled, SMT
|
||||
is not disabled because CPU is not
|
||||
vulnerable to cross-thread TAA attacks.
|
||||
off - Unconditionally disable TAA mitigation
|
||||
|
||||
On MDS-affected machines, tsx_async_abort=off can be
|
||||
prevented by an active MDS mitigation as both vulnerabilities
|
||||
are mitigated with the same mechanism so in order to disable
|
||||
this mitigation, you need to specify mds=off too.
|
||||
|
||||
Not specifying this option is equivalent to
|
||||
tsx_async_abort=full. On CPUs which are MDS affected
|
||||
and deploy MDS mitigation, TAA mitigation is not
|
||||
required and doesn't provide any additional
|
||||
mitigation.
|
||||
|
||||
For details see:
|
||||
Documentation/admin-guide/hw-vuln/tsx_async_abort.rst
|
||||
|
||||
turbografx.map[2|3]= [HW,JOY]
|
||||
TurboGraFX parallel port interface
|
||||
Format:
|
||||
@@ -4857,13 +4966,13 @@
|
||||
Flags is a set of characters, each corresponding
|
||||
to a common usb-storage quirk flag as follows:
|
||||
a = SANE_SENSE (collect more than 18 bytes
|
||||
of sense data);
|
||||
of sense data, not on uas);
|
||||
b = BAD_SENSE (don't collect more than 18
|
||||
bytes of sense data);
|
||||
bytes of sense data, not on uas);
|
||||
c = FIX_CAPACITY (decrease the reported
|
||||
device capacity by one sector);
|
||||
d = NO_READ_DISC_INFO (don't use
|
||||
READ_DISC_INFO command);
|
||||
READ_DISC_INFO command, not on uas);
|
||||
e = NO_READ_CAPACITY_16 (don't use
|
||||
READ_CAPACITY_16 command);
|
||||
f = NO_REPORT_OPCODES (don't use report opcodes
|
||||
@@ -4878,17 +4987,18 @@
|
||||
j = NO_REPORT_LUNS (don't use report luns
|
||||
command, uas only);
|
||||
l = NOT_LOCKABLE (don't try to lock and
|
||||
unlock ejectable media);
|
||||
unlock ejectable media, not on uas);
|
||||
m = MAX_SECTORS_64 (don't transfer more
|
||||
than 64 sectors = 32 KB at a time);
|
||||
than 64 sectors = 32 KB at a time,
|
||||
not on uas);
|
||||
n = INITIAL_READ10 (force a retry of the
|
||||
initial READ(10) command);
|
||||
initial READ(10) command, not on uas);
|
||||
o = CAPACITY_OK (accept the capacity
|
||||
reported by the device);
|
||||
reported by the device, not on uas);
|
||||
p = WRITE_CACHE (the device cache is ON
|
||||
by default);
|
||||
by default, not on uas);
|
||||
r = IGNORE_RESIDUE (the device reports
|
||||
bogus residue values);
|
||||
bogus residue values, not on uas);
|
||||
s = SINGLE_LUN (the device has only one
|
||||
Logical Unit);
|
||||
t = NO_ATA_1X (don't allow ATA(12) and ATA(16)
|
||||
@@ -4897,7 +5007,8 @@
|
||||
w = NO_WP_DETECT (don't test whether the
|
||||
medium is write-protected).
|
||||
y = ALWAYS_SYNC (issue a SYNCHRONIZE_CACHE
|
||||
even if the device claims no cache)
|
||||
even if the device claims no cache,
|
||||
not on uas)
|
||||
Example: quirks=0419:aaf5:rl,0421:0433:rc
|
||||
|
||||
user_debug= [KNL,ARM]
|
||||
@@ -5136,6 +5247,10 @@
|
||||
the unplug protocol
|
||||
never -- do not unplug even if version check succeeds
|
||||
|
||||
xen_legacy_crash [X86,XEN]
|
||||
Crash from Xen panic notifier, without executing late
|
||||
panic() code such as dumping handler.
|
||||
|
||||
xen_nopvspin [X86,XEN]
|
||||
Disables the ticketlock slowpath using Xen PV
|
||||
optimizations.
|
||||
|
||||
183
Documentation/block/inline-encryption.rst
Normal file
183
Documentation/block/inline-encryption.rst
Normal file
@@ -0,0 +1,183 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
=================
|
||||
Inline Encryption
|
||||
=================
|
||||
|
||||
Objective
|
||||
=========
|
||||
|
||||
We want to support inline encryption (IE) in the kernel.
|
||||
To allow for testing, we also want a crypto API fallback when actual
|
||||
IE hardware is absent. We also want IE to work with layered devices
|
||||
like dm and loopback (i.e. we want to be able to use the IE hardware
|
||||
of the underlying devices if present, or else fall back to crypto API
|
||||
en/decryption).
|
||||
|
||||
|
||||
Constraints and notes
|
||||
=====================
|
||||
|
||||
- IE hardware have a limited number of "keyslots" that can be programmed
|
||||
with an encryption context (key, algorithm, data unit size, etc.) at any time.
|
||||
One can specify a keyslot in a data request made to the device, and the
|
||||
device will en/decrypt the data using the encryption context programmed into
|
||||
that specified keyslot. When possible, we want to make multiple requests with
|
||||
the same encryption context share the same keyslot.
|
||||
|
||||
- We need a way for filesystems to specify an encryption context to use for
|
||||
en/decrypting a struct bio, and a device driver (like UFS) needs to be able
|
||||
to use that encryption context when it processes the bio.
|
||||
|
||||
- We need a way for device drivers to expose their capabilities in a unified
|
||||
way to the upper layers.
|
||||
|
||||
|
||||
Design
|
||||
======
|
||||
|
||||
We add a struct bio_crypt_ctx to struct bio that can represent an
|
||||
encryption context, because we need to be able to pass this encryption
|
||||
context from the FS layer to the device driver to act upon.
|
||||
|
||||
While IE hardware works on the notion of keyslots, the FS layer has no
|
||||
knowledge of keyslots - it simply wants to specify an encryption context to
|
||||
use while en/decrypting a bio.
|
||||
|
||||
We introduce a keyslot manager (KSM) that handles the translation from
|
||||
encryption contexts specified by the FS to keyslots on the IE hardware.
|
||||
This KSM also serves as the way IE hardware can expose their capabilities to
|
||||
upper layers. The generic mode of operation is: each device driver that wants
|
||||
to support IE will construct a KSM and set it up in its struct request_queue.
|
||||
Upper layers that want to use IE on this device can then use this KSM in
|
||||
the device's struct request_queue to translate an encryption context into
|
||||
a keyslot. The presence of the KSM in the request queue shall be used to mean
|
||||
that the device supports IE.
|
||||
|
||||
On the device driver end of the interface, the device driver needs to tell the
|
||||
KSM how to actually manipulate the IE hardware in the device to do things like
|
||||
programming the crypto key into the IE hardware into a particular keyslot. All
|
||||
this is achieved through the :c:type:`struct keyslot_mgmt_ll_ops` that the
|
||||
device driver passes to the KSM when creating it.
|
||||
|
||||
It uses refcounts to track which keyslots are idle (either they have no
|
||||
encryption context programmed, or there are no in-flight struct bios
|
||||
referencing that keyslot). When a new encryption context needs a keyslot, it
|
||||
tries to find a keyslot that has already been programmed with the same
|
||||
encryption context, and if there is no such keyslot, it evicts the least
|
||||
recently used idle keyslot and programs the new encryption context into that
|
||||
one. If no idle keyslots are available, then the caller will sleep until there
|
||||
is at least one.
|
||||
|
||||
|
||||
Blk-crypto
|
||||
==========
|
||||
|
||||
The above is sufficient for simple cases, but does not work if there is a
|
||||
need for a crypto API fallback, or if we are want to use IE with layered
|
||||
devices. To these ends, we introduce blk-crypto. Blk-crypto allows us to
|
||||
present a unified view of encryption to the FS (so FS only needs to specify
|
||||
an encryption context and not worry about keyslots at all), and blk-crypto
|
||||
can decide whether to delegate the en/decryption to IE hardware or to the
|
||||
crypto API. Blk-crypto maintains an internal KSM that serves as the crypto
|
||||
API fallback.
|
||||
|
||||
Blk-crypto needs to ensure that the encryption context is programmed into the
|
||||
"correct" keyslot manager for IE. If a bio is submitted to a layered device
|
||||
that eventually passes the bio down to a device that really does support IE, we
|
||||
want the encryption context to be programmed into a keyslot for the KSM of the
|
||||
device with IE support. However, blk-crypto does not know a priori whether a
|
||||
particular device is the final device in the layering structure for a bio or
|
||||
not. So in the case that a particular device does not support IE, since it is
|
||||
possibly the final destination device for the bio, if the bio requires
|
||||
encryption (i.e. the bio is doing a write operation), blk-crypto must fallback
|
||||
to the crypto API *before* sending the bio to the device.
|
||||
|
||||
Blk-crypto ensures that:
|
||||
|
||||
- The bio's encryption context is programmed into a keyslot in the KSM of the
|
||||
request queue that the bio is being submitted to (or the crypto API fallback
|
||||
KSM if the request queue doesn't have a KSM), and that the ``bc_ksm``
|
||||
in the ``bi_crypt_context`` is set to this KSM
|
||||
|
||||
- That the bio has its own individual reference to the keyslot in this KSM.
|
||||
Once the bio passes through blk-crypto, its encryption context is programmed
|
||||
in some KSM. The "its own individual reference to the keyslot" ensures that
|
||||
keyslots can be released by each bio independently of other bios while
|
||||
ensuring that the bio has a valid reference to the keyslot when, for e.g., the
|
||||
crypto API fallback KSM in blk-crypto performs crypto on the device's behalf.
|
||||
The individual references are ensured by increasing the refcount for the
|
||||
keyslot in the ``bc_ksm`` when a bio with a programmed encryption
|
||||
context is cloned.
|
||||
|
||||
|
||||
What blk-crypto does on bio submission
|
||||
--------------------------------------
|
||||
|
||||
**Case 1:** blk-crypto is given a bio with only an encryption context that hasn't
|
||||
been programmed into any keyslot in any KSM (for e.g. a bio from the FS).
|
||||
In this case, blk-crypto will program the encryption context into the KSM of the
|
||||
request queue the bio is being submitted to (and if this KSM does not exist,
|
||||
then it will program it into blk-crypto's internal KSM for crypto API
|
||||
fallback). The KSM that this encryption context was programmed into is stored
|
||||
as the ``bc_ksm`` in the bio's ``bi_crypt_context``.
|
||||
|
||||
**Case 2:** blk-crypto is given a bio whose encryption context has already been
|
||||
programmed into a keyslot in the *crypto API fallback* KSM.
|
||||
In this case, blk-crypto does nothing; it treats the bio as not having
|
||||
specified an encryption context. Note that we cannot do here what we will do
|
||||
in Case 3 because we would have already encrypted the bio via the crypto API
|
||||
by this point.
|
||||
|
||||
**Case 3:** blk-crypto is given a bio whose encryption context has already been
|
||||
programmed into a keyslot in some KSM (that is *not* the crypto API fallback
|
||||
KSM).
|
||||
In this case, blk-crypto first releases that keyslot from that KSM and then
|
||||
treats the bio as in Case 1.
|
||||
|
||||
This way, when a device driver is processing a bio, it can be sure that
|
||||
the bio's encryption context has been programmed into some KSM (either the
|
||||
device driver's request queue's KSM, or blk-crypto's crypto API fallback KSM).
|
||||
It then simply needs to check if the bio's ``bc_ksm`` is the device's
|
||||
request queue's KSM. If so, then it should proceed with IE. If not, it should
|
||||
simply do nothing with respect to crypto, because some other KSM (perhaps the
|
||||
blk-crypto crypto API fallback KSM) is handling the en/decryption.
|
||||
|
||||
Blk-crypto will release the keyslot that is being held by the bio (and also
|
||||
decrypt it if the bio is using the crypto API fallback KSM) once
|
||||
``bio_remaining_done`` returns true for the bio.
|
||||
|
||||
|
||||
Layered Devices
|
||||
===============
|
||||
|
||||
Layered devices that wish to support IE need to create their own keyslot
|
||||
manager for their request queue, and expose whatever functionality they choose.
|
||||
When a layered device wants to pass a bio to another layer (either by
|
||||
resubmitting the same bio, or by submitting a clone), it doesn't need to do
|
||||
anything special because the bio (or the clone) will once again pass through
|
||||
blk-crypto, which will work as described in Case 3. If a layered device wants
|
||||
for some reason to do the IO by itself instead of passing it on to a child
|
||||
device, but it also chose to expose IE capabilities by setting up a KSM in its
|
||||
request queue, it is then responsible for en/decrypting the data itself. In
|
||||
such cases, the device can choose to call the blk-crypto function
|
||||
``blk_crypto_fallback_to_kernel_crypto_api`` (TODO: Not yet implemented), which will
|
||||
cause the en/decryption to be done via the crypto API fallback.
|
||||
|
||||
|
||||
Future Optimizations for layered devices
|
||||
========================================
|
||||
|
||||
Creating a keyslot manager for the layered device uses up memory for each
|
||||
keyslot, and in general, a layered device (like dm-linear) merely passes the
|
||||
request on to a "child" device, so the keyslots in the layered device itself
|
||||
might be completely unused. We can instead define a new type of KSM; the
|
||||
"passthrough KSM", that layered devices can use to let blk-crypto know that
|
||||
this layered device *will* pass the bio to some child device (and hence
|
||||
through blk-crypto again, at which point blk-crypto can program the encryption
|
||||
context, instead of programming it into the layered device's KSM). Again, if
|
||||
the device "lies" and decides to do the IO itself instead of passing it on to
|
||||
a child device, it is responsible for doing the en/decryption (and can choose
|
||||
to call ``blk_crypto_fallback_to_kernel_crypto_api``). Another use case for the
|
||||
"passthrough KSM" is for IE devices that want to manage their own keyslots/do
|
||||
not have a limited number of keyslots.
|
||||
@@ -34,6 +34,7 @@ Profiling data will only become accessible once debugfs has been mounted::
|
||||
|
||||
Coverage collection
|
||||
-------------------
|
||||
|
||||
The following program demonstrates coverage collection from within a test
|
||||
program using kcov:
|
||||
|
||||
@@ -128,6 +129,7 @@ only need to enable coverage (disable happens automatically on thread end).
|
||||
|
||||
Comparison operands collection
|
||||
------------------------------
|
||||
|
||||
Comparison operands collection is similar to coverage collection:
|
||||
|
||||
.. code-block:: c
|
||||
@@ -202,3 +204,130 @@ Comparison operands collection is similar to coverage collection:
|
||||
|
||||
Note that the kcov modes (coverage collection or comparison operands) are
|
||||
mutually exclusive.
|
||||
|
||||
Remote coverage collection
|
||||
--------------------------
|
||||
|
||||
With KCOV_ENABLE coverage is collected only for syscalls that are issued
|
||||
from the current process. With KCOV_REMOTE_ENABLE it's possible to collect
|
||||
coverage for arbitrary parts of the kernel code, provided that those parts
|
||||
are annotated with kcov_remote_start()/kcov_remote_stop().
|
||||
|
||||
This allows to collect coverage from two types of kernel background
|
||||
threads: the global ones, that are spawned during kernel boot in a limited
|
||||
number of instances (e.g. one USB hub_event() worker thread is spawned per
|
||||
USB HCD); and the local ones, that are spawned when a user interacts with
|
||||
some kernel interface (e.g. vhost workers).
|
||||
|
||||
To enable collecting coverage from a global background thread, a unique
|
||||
global handle must be assigned and passed to the corresponding
|
||||
kcov_remote_start() call. Then a userspace process can pass a list of such
|
||||
handles to the KCOV_REMOTE_ENABLE ioctl in the handles array field of the
|
||||
kcov_remote_arg struct. This will attach the used kcov device to the code
|
||||
sections, that are referenced by those handles.
|
||||
|
||||
Since there might be many local background threads spawned from different
|
||||
userspace processes, we can't use a single global handle per annotation.
|
||||
Instead, the userspace process passes a non-zero handle through the
|
||||
common_handle field of the kcov_remote_arg struct. This common handle gets
|
||||
saved to the kcov_handle field in the current task_struct and needs to be
|
||||
passed to the newly spawned threads via custom annotations. Those threads
|
||||
should in turn be annotated with kcov_remote_start()/kcov_remote_stop().
|
||||
|
||||
Internally kcov stores handles as u64 integers. The top byte of a handle
|
||||
is used to denote the id of a subsystem that this handle belongs to, and
|
||||
the lower 4 bytes are used to denote the id of a thread instance within
|
||||
that subsystem. A reserved value 0 is used as a subsystem id for common
|
||||
handles as they don't belong to a particular subsystem. The bytes 4-7 are
|
||||
currently reserved and must be zero. In the future the number of bytes
|
||||
used for the subsystem or handle ids might be increased.
|
||||
|
||||
When a particular userspace proccess collects coverage by via a common
|
||||
handle, kcov will collect coverage for each code section that is annotated
|
||||
to use the common handle obtained as kcov_handle from the current
|
||||
task_struct. However non common handles allow to collect coverage
|
||||
selectively from different subsystems.
|
||||
|
||||
.. code-block:: c
|
||||
|
||||
struct kcov_remote_arg {
|
||||
__u32 trace_mode;
|
||||
__u32 area_size;
|
||||
__u32 num_handles;
|
||||
__aligned_u64 common_handle;
|
||||
__aligned_u64 handles[0];
|
||||
};
|
||||
|
||||
#define KCOV_INIT_TRACE _IOR('c', 1, unsigned long)
|
||||
#define KCOV_DISABLE _IO('c', 101)
|
||||
#define KCOV_REMOTE_ENABLE _IOW('c', 102, struct kcov_remote_arg)
|
||||
|
||||
#define COVER_SIZE (64 << 10)
|
||||
|
||||
#define KCOV_TRACE_PC 0
|
||||
|
||||
#define KCOV_SUBSYSTEM_COMMON (0x00ull << 56)
|
||||
#define KCOV_SUBSYSTEM_USB (0x01ull << 56)
|
||||
|
||||
#define KCOV_SUBSYSTEM_MASK (0xffull << 56)
|
||||
#define KCOV_INSTANCE_MASK (0xffffffffull)
|
||||
|
||||
static inline __u64 kcov_remote_handle(__u64 subsys, __u64 inst)
|
||||
{
|
||||
if (subsys & ~KCOV_SUBSYSTEM_MASK || inst & ~KCOV_INSTANCE_MASK)
|
||||
return 0;
|
||||
return subsys | inst;
|
||||
}
|
||||
|
||||
#define KCOV_COMMON_ID 0x42
|
||||
#define KCOV_USB_BUS_NUM 1
|
||||
|
||||
int main(int argc, char **argv)
|
||||
{
|
||||
int fd;
|
||||
unsigned long *cover, n, i;
|
||||
struct kcov_remote_arg *arg;
|
||||
|
||||
fd = open("/sys/kernel/debug/kcov", O_RDWR);
|
||||
if (fd == -1)
|
||||
perror("open"), exit(1);
|
||||
if (ioctl(fd, KCOV_INIT_TRACE, COVER_SIZE))
|
||||
perror("ioctl"), exit(1);
|
||||
cover = (unsigned long*)mmap(NULL, COVER_SIZE * sizeof(unsigned long),
|
||||
PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
|
||||
if ((void*)cover == MAP_FAILED)
|
||||
perror("mmap"), exit(1);
|
||||
|
||||
/* Enable coverage collection via common handle and from USB bus #1. */
|
||||
arg = calloc(1, sizeof(*arg) + sizeof(uint64_t));
|
||||
if (!arg)
|
||||
perror("calloc"), exit(1);
|
||||
arg->trace_mode = KCOV_TRACE_PC;
|
||||
arg->area_size = COVER_SIZE;
|
||||
arg->num_handles = 1;
|
||||
arg->common_handle = kcov_remote_handle(KCOV_SUBSYSTEM_COMMON,
|
||||
KCOV_COMMON_ID);
|
||||
arg->handles[0] = kcov_remote_handle(KCOV_SUBSYSTEM_USB,
|
||||
KCOV_USB_BUS_NUM);
|
||||
if (ioctl(fd, KCOV_REMOTE_ENABLE, arg))
|
||||
perror("ioctl"), free(arg), exit(1);
|
||||
free(arg);
|
||||
|
||||
/*
|
||||
* Here the user needs to trigger execution of a kernel code section
|
||||
* that is either annotated with the common handle, or to trigger some
|
||||
* activity on USB bus #1.
|
||||
*/
|
||||
sleep(2);
|
||||
|
||||
n = __atomic_load_n(&cover[0], __ATOMIC_RELAXED);
|
||||
for (i = 0; i < n; i++)
|
||||
printf("0x%lx\n", cover[i + 1]);
|
||||
if (ioctl(fd, KCOV_DISABLE, 0))
|
||||
perror("ioctl"), exit(1);
|
||||
if (munmap(cover, COVER_SIZE * sizeof(unsigned long)))
|
||||
perror("munmap"), exit(1);
|
||||
if (close(fd))
|
||||
perror("close"), exit(1);
|
||||
return 0;
|
||||
}
|
||||
|
||||
@@ -75,6 +75,15 @@ its hardware characteristcs.
|
||||
|
||||
* port or ports: same as above.
|
||||
|
||||
* Optional properties for all components:
|
||||
|
||||
* arm,coresight-loses-context-with-cpu : boolean. Indicates that the
|
||||
hardware will lose register context on CPU power down (e.g. CPUIdle).
|
||||
An example of where this may be needed are systems which contain a
|
||||
coresight component and CPU in the same power domain. When the CPU
|
||||
powers down the coresight component also powers down and loses its
|
||||
context. This property is currently only used for the ETM 4.x driver.
|
||||
|
||||
* Optional properties for ETM/PTMs:
|
||||
|
||||
* arm,cp14: must be present if the system accesses ETM/PTM management
|
||||
|
||||
@@ -35,6 +35,7 @@ Required standard properties:
|
||||
"ti,sysc-omap3-sham"
|
||||
"ti,sysc-omap-aes"
|
||||
"ti,sysc-mcasp"
|
||||
"ti,sysc-dra7-mcasp"
|
||||
"ti,sysc-usb-host-fs"
|
||||
"ti,sysc-dra7-mcan"
|
||||
|
||||
|
||||
@@ -46,7 +46,7 @@ Required properties:
|
||||
Example (R-Car H3):
|
||||
|
||||
usb2_clksel: clock-controller@e6590630 {
|
||||
compatible = "renesas,r8a77950-rcar-usb2-clock-sel",
|
||||
compatible = "renesas,r8a7795-rcar-usb2-clock-sel",
|
||||
"renesas,rcar-gen3-usb2-clock-sel";
|
||||
reg = <0 0xe6590630 0 0x02>;
|
||||
clocks = <&cpg CPG_MOD 703>, <&usb_extal>, <&usb_xtal>;
|
||||
|
||||
@@ -73,7 +73,7 @@ Example:
|
||||
};
|
||||
};
|
||||
|
||||
port@10 {
|
||||
port@a {
|
||||
reg = <10>;
|
||||
|
||||
adv7482_txa: endpoint {
|
||||
@@ -83,7 +83,7 @@ Example:
|
||||
};
|
||||
};
|
||||
|
||||
port@11 {
|
||||
port@b {
|
||||
reg = <11>;
|
||||
|
||||
adv7482_txb: endpoint {
|
||||
|
||||
@@ -19,6 +19,9 @@ Optional properties:
|
||||
- interrupt-names: must be "mdio_done_error" when there is a share interrupt fed
|
||||
to this hardware block, or must be "mdio_done" for the first interrupt and
|
||||
"mdio_error" for the second when there are separate interrupts
|
||||
- clocks: A reference to the clock supplying the MDIO bus controller
|
||||
- clock-frequency: the MDIO bus clock that must be output by the MDIO bus
|
||||
hardware, if absent, the default hardware values are used
|
||||
|
||||
Child nodes of this MDIO bus controller node are standard Ethernet PHY device
|
||||
nodes as described in Documentation/devicetree/bindings/net/phy.txt
|
||||
|
||||
27
Documentation/devicetree/bindings/rng/omap3_rom_rng.txt
Normal file
27
Documentation/devicetree/bindings/rng/omap3_rom_rng.txt
Normal file
@@ -0,0 +1,27 @@
|
||||
OMAP ROM RNG driver binding
|
||||
|
||||
Secure SoCs may provide RNG via secure ROM calls like Nokia N900 does. The
|
||||
implementation can depend on the SoC secure ROM used.
|
||||
|
||||
- compatible:
|
||||
Usage: required
|
||||
Value type: <string>
|
||||
Definition: must be "nokia,n900-rom-rng"
|
||||
|
||||
- clocks:
|
||||
Usage: required
|
||||
Value type: <prop-encoded-array>
|
||||
Definition: reference to the the RNG interface clock
|
||||
|
||||
- clock-names:
|
||||
Usage: required
|
||||
Value type: <stringlist>
|
||||
Definition: must be "ick"
|
||||
|
||||
Example:
|
||||
|
||||
rom_rng: rng {
|
||||
compatible = "nokia,n900-rom-rng";
|
||||
clocks = <&rng_ick>;
|
||||
clock-names = "ick";
|
||||
};
|
||||
@@ -27,4 +27,4 @@ and valid to enable charging:
|
||||
|
||||
- "abracon,tc-diode": should be "standard" (0.6V) or "schottky" (0.3V)
|
||||
- "abracon,tc-resistor": should be <0>, <3>, <6> or <11>. 0 disables the output
|
||||
resistor, the other values are in ohm.
|
||||
resistor, the other values are in kOhm.
|
||||
|
||||
Some files were not shown because too many files have changed in this diff Show More
Reference in New Issue
Block a user