https://source.android.com/security/bulletin/2020-02-01
CVE-2020-0030
CVE-2019-11599

* tag 'ASB-2020-02-05_4.19': (4206 commits)
  UPSTREAM: sched/fair/util_est: Implement faster ramp-up EWMA on utilization increases
  ANDROID: Re-use SUGOV_RT_MAX_FREQ to control uclamp rt behavior
  BACKPORT: sched/fair: Make EAS wakeup placement consider uclamp restrictions
  BACKPORT: sched/fair: Make task_fits_capacity() consider uclamp restrictions
  ANDROID: sched/core: Move SchedTune task API into UtilClamp wrappers
  ANDROID: sched/core: Add a latency-sensitive flag to uclamp
  ANDROID: sched/tune: Move SchedTune cpu API into UtilClamp wrappers
  ANDROID: init: kconfig: Only allow sched tune if !uclamp
  FROMGIT: sched/core: Fix size of rq::uclamp initialization
  FROMGIT: sched/uclamp: Fix a bug in propagating uclamp value in new cgroups
  FROMGIT: sched/uclamp: Rename uclamp_util_with() into uclamp_rq_util_with()
  FROMGIT: sched/uclamp: Make uclamp util helpers use and return UL values
  FROMGIT: sched/uclamp: Remove uclamp_util()
  BACKPORT: sched/rt: Make RT capacity-aware
  UPSTREAM: tools headers UAPI: Sync sched.h with the kernel
  UPSTREAM: sched/uclamp: Fix overzealous type replacement
  UPSTREAM: sched/uclamp: Fix incorrect condition
  UPSTREAM: sched/core: Fix compilation error when cgroup not selected
  UPSTREAM: sched/core: Fix uclamp ABI bug, clean up and robustify sched_read_attr() ABI logic and code
  UPSTREAM: sched/uclamp: Always use 'enum uclamp_id' for clamp_id values
  ...

Conflicts:
	drivers/char/random.c
	drivers/devfreq/devfreq.c
	drivers/gpu/drm/drm_fb_helper.c
	drivers/media/i2c/ov2680.c
	drivers/media/i2c/ov2685.c
	drivers/media/i2c/ov5670.c
	drivers/media/i2c/ov5695.c
	drivers/media/usb/uvc/uvc_driver.c
	drivers/mmc/host/cqhci.c
	drivers/spi/spi-rockchip.c
	drivers/usb/dwc2/params.c
	drivers/usb/dwc3/debugfs.c
	drivers/usb/dwc3/gadget.c
	drivers/usb/serial/usb_wwan.c
	include/linux/clk-provider.h
	include/linux/mfd/rk808.h
	kernel/cpu.c
	sound/usb/quirks.c

- Export symbol mm_trace_rss_stat on mm/memory.c for GPU drivers.
- Fix sound/usb/pcm.c for SNDRV_PCM_TRIGGER_SUSPEND.
- Enable DEBUG_FS which is not selected by TRACING.
- Disable of_devlink which broken boot. of_devlink is enabled by commit
  ba3aa33b8f ("ANDROID: of: property: Enable of_devlink by default").
- Add CLK_DONT_HOLD_STATE and CLK_KEEP_REQ_RATE to clk_flags
  on drivers/clk/clk.c.

Change-Id: I500ca1bbc735753f9c8251ed2ac8ad757d5a24a4
Signed-off-by: Tao Huang <huangtao@rock-chips.com>
This commit is contained in:
Tao Huang
2020-02-17 15:47:12 +08:00
3568 changed files with 113350 additions and 191436 deletions

View File

@@ -199,7 +199,7 @@ Description:
What: /sys/bus/iio/devices/iio:deviceX/in_positionrelative_x_raw
What: /sys/bus/iio/devices/iio:deviceX/in_positionrelative_y_raw
KernelVersion: 4.18
KernelVersion: 4.19
Contact: linux-iio@vger.kernel.org
Description:
Relative position in direction x or y on a pad (may be

View File

@@ -4,7 +4,7 @@ KernelVersion: 3.10
Contact: Samuel Ortiz <sameo@linux.intel.com>
linux-mei@linux.intel.com
Description: Stores the same MODALIAS value emitted by uevent
Format: mei:<mei device name>:<device uuid>:
Format: mei:<mei device name>:<device uuid>:<protocol version>
What: /sys/bus/mei/devices/.../name
Date: May 2015

View File

@@ -478,6 +478,8 @@ What: /sys/devices/system/cpu/vulnerabilities
/sys/devices/system/cpu/vulnerabilities/spec_store_bypass
/sys/devices/system/cpu/vulnerabilities/l1tf
/sys/devices/system/cpu/vulnerabilities/mds
/sys/devices/system/cpu/vulnerabilities/tsx_async_abort
/sys/devices/system/cpu/vulnerabilities/itlb_multihit
Date: January 2018
Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org>
Description: Information about CPU vulnerabilities

View File

@@ -31,6 +31,12 @@ Contact: "Jaegeuk Kim" <jaegeuk.kim@samsung.com>
Description:
Controls the issue rate of segment discard commands.
What: /sys/fs/f2fs/<disk>/max_blkaddr
Date: November 2019
Contact: "Ramon Pantin" <pantin@google.com>
Description:
Shows first block address of MAIN area.
What: /sys/fs/f2fs/<disk>/ipu_policy
Date: November 2013
Contact: "Jaegeuk Kim" <jaegeuk.kim@samsung.com>

View File

@@ -0,0 +1,27 @@
What: /sys/kernel/ion
Date: Dec 2019
KernelVersion: 4.14.158
Contact: Suren Baghdasaryan <surenb@google.com>,
Sandeep Patil <sspatil@google.com>
Description:
The /sys/kernel/ion directory contains a snapshot of the
internal state of ION memory heaps and pools.
Users: kernel memory tuning tools
What: /sys/kernel/ion/total_heaps_kb
Date: Dec 2019
KernelVersion: 4.14.158
Contact: Suren Baghdasaryan <surenb@google.com>,
Sandeep Patil <sspatil@google.com>
Description:
The total_heaps_kb file is read-only and specifies how much
memory in Kb is allocated to ION heaps.
What: /sys/kernel/ion/total_pools_kb
Date: Dec 2019
KernelVersion: 4.14.158
Contact: Suren Baghdasaryan <surenb@google.com>,
Sandeep Patil <sspatil@google.com>
Description:
The total_pools_kb file is read-only and specifies how much
memory in Kb is allocated to ION pools.

View File

@@ -694,6 +694,12 @@ Conventions
informational files on the root cgroup which end up showing global
information available elsewhere shouldn't exist.
- The default time unit is microseconds. If a different unit is ever
used, an explicit unit suffix must be present.
- A parts-per quantity should use a percentage decimal with at least
two digit fractional part - e.g. 13.40.
- If a controller implements weight based resource distribution, its
interface file should be named "weight" and have the range [1,
10000] with 100 as the default. The values are chosen to allow
@@ -907,6 +913,13 @@ controller implements weight and absolute bandwidth limit models for
normal scheduling policy and absolute bandwidth allocation model for
realtime scheduling policy.
In all the above models, cycles distribution is defined only on a temporal
base and it does not account for the frequency at which tasks are executed.
The (optional) utilization clamping support allows to hint the schedutil
cpufreq governor about the minimum desired frequency which should always be
provided by a CPU, as well as the maximum desired frequency, which should not
be exceeded by a CPU.
WARNING: cgroup2 doesn't yet support control of realtime processes and
the cpu controller can only be enabled when all RT processes are in
the root cgroup. Be aware that system management software may already
@@ -972,6 +985,33 @@ All time durations are in microseconds.
Shows pressure stall information for CPU. See
Documentation/accounting/psi.txt for details.
cpu.uclamp.min
A read-write single value file which exists on non-root cgroups.
The default is "0", i.e. no utilization boosting.
The requested minimum utilization (protection) as a percentage
rational number, e.g. 12.34 for 12.34%.
This interface allows reading and setting minimum utilization clamp
values similar to the sched_setattr(2). This minimum utilization
value is used to clamp the task specific minimum utilization clamp.
The requested minimum utilization (protection) is always capped by
the current value for the maximum utilization (limit), i.e.
`cpu.uclamp.max`.
cpu.uclamp.max
A read-write single value file which exists on non-root cgroups.
The default is "max". i.e. no utilization capping
The requested maximum utilization (limit) as a percentage rational
number, e.g. 98.76 for 98.76%.
This interface allows reading and setting maximum utilization clamp
values similar to the sched_setattr(2). This maximum utilization
value is used to clamp the task specific maximum utilization clamp.
Memory
------

View File

@@ -12,3 +12,5 @@ are configurable at compile, boot or run time.
spectre
l1tf
mds
tsx_async_abort
multihit.rst

View File

@@ -265,8 +265,11 @@ time with the option "mds=". The valid arguments for this option are:
============ =============================================================
Not specifying this option is equivalent to "mds=full".
Not specifying this option is equivalent to "mds=full". For processors
that are affected by both TAA (TSX Asynchronous Abort) and MDS,
specifying just "mds=off" without an accompanying "tsx_async_abort=off"
will have no effect as the same mitigation is used for both
vulnerabilities.
Mitigation selection guide
--------------------------

View File

@@ -0,0 +1,163 @@
iTLB multihit
=============
iTLB multihit is an erratum where some processors may incur a machine check
error, possibly resulting in an unrecoverable CPU lockup, when an
instruction fetch hits multiple entries in the instruction TLB. This can
occur when the page size is changed along with either the physical address
or cache type. A malicious guest running on a virtualized system can
exploit this erratum to perform a denial of service attack.
Affected processors
-------------------
Variations of this erratum are present on most Intel Core and Xeon processor
models. The erratum is not present on:
- non-Intel processors
- Some Atoms (Airmont, Bonnell, Goldmont, GoldmontPlus, Saltwell, Silvermont)
- Intel processors that have the PSCHANGE_MC_NO bit set in the
IA32_ARCH_CAPABILITIES MSR.
Related CVEs
------------
The following CVE entry is related to this issue:
============== =================================================
CVE-2018-12207 Machine Check Error Avoidance on Page Size Change
============== =================================================
Problem
-------
Privileged software, including OS and virtual machine managers (VMM), are in
charge of memory management. A key component in memory management is the control
of the page tables. Modern processors use virtual memory, a technique that creates
the illusion of a very large memory for processors. This virtual space is split
into pages of a given size. Page tables translate virtual addresses to physical
addresses.
To reduce latency when performing a virtual to physical address translation,
processors include a structure, called TLB, that caches recent translations.
There are separate TLBs for instruction (iTLB) and data (dTLB).
Under this errata, instructions are fetched from a linear address translated
using a 4 KB translation cached in the iTLB. Privileged software modifies the
paging structure so that the same linear address using large page size (2 MB, 4
MB, 1 GB) with a different physical address or memory type. After the page
structure modification but before the software invalidates any iTLB entries for
the linear address, a code fetch that happens on the same linear address may
cause a machine-check error which can result in a system hang or shutdown.
Attack scenarios
----------------
Attacks against the iTLB multihit erratum can be mounted from malicious
guests in a virtualized system.
iTLB multihit system information
--------------------------------
The Linux kernel provides a sysfs interface to enumerate the current iTLB
multihit status of the system:whether the system is vulnerable and which
mitigations are active. The relevant sysfs file is:
/sys/devices/system/cpu/vulnerabilities/itlb_multihit
The possible values in this file are:
.. list-table::
* - Not affected
- The processor is not vulnerable.
* - KVM: Mitigation: Split huge pages
- Software changes mitigate this issue.
* - KVM: Vulnerable
- The processor is vulnerable, but no mitigation enabled
Enumeration of the erratum
--------------------------------
A new bit has been allocated in the IA32_ARCH_CAPABILITIES (PSCHANGE_MC_NO) msr
and will be set on CPU's which are mitigated against this issue.
======================================= =========== ===============================
IA32_ARCH_CAPABILITIES MSR Not present Possibly vulnerable,check model
IA32_ARCH_CAPABILITIES[PSCHANGE_MC_NO] '0' Likely vulnerable,check model
IA32_ARCH_CAPABILITIES[PSCHANGE_MC_NO] '1' Not vulnerable
======================================= =========== ===============================
Mitigation mechanism
-------------------------
This erratum can be mitigated by restricting the use of large page sizes to
non-executable pages. This forces all iTLB entries to be 4K, and removes
the possibility of multiple hits.
In order to mitigate the vulnerability, KVM initially marks all huge pages
as non-executable. If the guest attempts to execute in one of those pages,
the page is broken down into 4K pages, which are then marked executable.
If EPT is disabled or not available on the host, KVM is in control of TLB
flushes and the problematic situation cannot happen. However, the shadow
EPT paging mechanism used by nested virtualization is vulnerable, because
the nested guest can trigger multiple iTLB hits by modifying its own
(non-nested) page tables. For simplicity, KVM will make large pages
non-executable in all shadow paging modes.
Mitigation control on the kernel command line and KVM - module parameter
------------------------------------------------------------------------
The KVM hypervisor mitigation mechanism for marking huge pages as
non-executable can be controlled with a module parameter "nx_huge_pages=".
The kernel command line allows to control the iTLB multihit mitigations at
boot time with the option "kvm.nx_huge_pages=".
The valid arguments for these options are:
========== ================================================================
force Mitigation is enabled. In this case, the mitigation implements
non-executable huge pages in Linux kernel KVM module. All huge
pages in the EPT are marked as non-executable.
If a guest attempts to execute in one of those pages, the page is
broken down into 4K pages, which are then marked executable.
off Mitigation is disabled.
auto Enable mitigation only if the platform is affected and the kernel
was not booted with the "mitigations=off" command line parameter.
This is the default option.
========== ================================================================
Mitigation selection guide
--------------------------
1. No virtualization in use
^^^^^^^^^^^^^^^^^^^^^^^^^^^
The system is protected by the kernel unconditionally and no further
action is required.
2. Virtualization with trusted guests
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If the guest comes from a trusted source, you may assume that the guest will
not attempt to maliciously exploit these errata and no further action is
required.
3. Virtualization with untrusted guests
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If the guest comes from an untrusted source, the guest host kernel will need
to apply iTLB multihit mitigation via the kernel command line or kvm
module parameter.

View File

@@ -0,0 +1,279 @@
.. SPDX-License-Identifier: GPL-2.0
TAA - TSX Asynchronous Abort
======================================
TAA is a hardware vulnerability that allows unprivileged speculative access to
data which is available in various CPU internal buffers by using asynchronous
aborts within an Intel TSX transactional region.
Affected processors
-------------------
This vulnerability only affects Intel processors that support Intel
Transactional Synchronization Extensions (TSX) when the TAA_NO bit (bit 8)
is 0 in the IA32_ARCH_CAPABILITIES MSR. On processors where the MDS_NO bit
(bit 5) is 0 in the IA32_ARCH_CAPABILITIES MSR, the existing MDS mitigations
also mitigate against TAA.
Whether a processor is affected or not can be read out from the TAA
vulnerability file in sysfs. See :ref:`tsx_async_abort_sys_info`.
Related CVEs
------------
The following CVE entry is related to this TAA issue:
============== ===== ===================================================
CVE-2019-11135 TAA TSX Asynchronous Abort (TAA) condition on some
microprocessors utilizing speculative execution may
allow an authenticated user to potentially enable
information disclosure via a side channel with
local access.
============== ===== ===================================================
Problem
-------
When performing store, load or L1 refill operations, processors write
data into temporary microarchitectural structures (buffers). The data in
those buffers can be forwarded to load operations as an optimization.
Intel TSX is an extension to the x86 instruction set architecture that adds
hardware transactional memory support to improve performance of multi-threaded
software. TSX lets the processor expose and exploit concurrency hidden in an
application due to dynamically avoiding unnecessary synchronization.
TSX supports atomic memory transactions that are either committed (success) or
aborted. During an abort, operations that happened within the transactional region
are rolled back. An asynchronous abort takes place, among other options, when a
different thread accesses a cache line that is also used within the transactional
region when that access might lead to a data race.
Immediately after an uncompleted asynchronous abort, certain speculatively
executed loads may read data from those internal buffers and pass it to dependent
operations. This can be then used to infer the value via a cache side channel
attack.
Because the buffers are potentially shared between Hyper-Threads cross
Hyper-Thread attacks are possible.
The victim of a malicious actor does not need to make use of TSX. Only the
attacker needs to begin a TSX transaction and raise an asynchronous abort
which in turn potenitally leaks data stored in the buffers.
More detailed technical information is available in the TAA specific x86
architecture section: :ref:`Documentation/x86/tsx_async_abort.rst <tsx_async_abort>`.
Attack scenarios
----------------
Attacks against the TAA vulnerability can be implemented from unprivileged
applications running on hosts or guests.
As for MDS, the attacker has no control over the memory addresses that can
be leaked. Only the victim is responsible for bringing data to the CPU. As
a result, the malicious actor has to sample as much data as possible and
then postprocess it to try to infer any useful information from it.
A potential attacker only has read access to the data. Also, there is no direct
privilege escalation by using this technique.
.. _tsx_async_abort_sys_info:
TAA system information
-----------------------
The Linux kernel provides a sysfs interface to enumerate the current TAA status
of mitigated systems. The relevant sysfs file is:
/sys/devices/system/cpu/vulnerabilities/tsx_async_abort
The possible values in this file are:
.. list-table::
* - 'Vulnerable'
- The CPU is affected by this vulnerability and the microcode and kernel mitigation are not applied.
* - 'Vulnerable: Clear CPU buffers attempted, no microcode'
- The system tries to clear the buffers but the microcode might not support the operation.
* - 'Mitigation: Clear CPU buffers'
- The microcode has been updated to clear the buffers. TSX is still enabled.
* - 'Mitigation: TSX disabled'
- TSX is disabled.
* - 'Not affected'
- The CPU is not affected by this issue.
.. _ucode_needed:
Best effort mitigation mode
^^^^^^^^^^^^^^^^^^^^^^^^^^^
If the processor is vulnerable, but the availability of the microcode-based
mitigation mechanism is not advertised via CPUID the kernel selects a best
effort mitigation mode. This mode invokes the mitigation instructions
without a guarantee that they clear the CPU buffers.
This is done to address virtualization scenarios where the host has the
microcode update applied, but the hypervisor is not yet updated to expose the
CPUID to the guest. If the host has updated microcode the protection takes
effect; otherwise a few CPU cycles are wasted pointlessly.
The state in the tsx_async_abort sysfs file reflects this situation
accordingly.
Mitigation mechanism
--------------------
The kernel detects the affected CPUs and the presence of the microcode which is
required. If a CPU is affected and the microcode is available, then the kernel
enables the mitigation by default.
The mitigation can be controlled at boot time via a kernel command line option.
See :ref:`taa_mitigation_control_command_line`.
.. _virt_mechanism:
Virtualization mitigation
^^^^^^^^^^^^^^^^^^^^^^^^^
Affected systems where the host has TAA microcode and TAA is mitigated by
having disabled TSX previously, are not vulnerable regardless of the status
of the VMs.
In all other cases, if the host either does not have the TAA microcode or
the kernel is not mitigated, the system might be vulnerable.
.. _taa_mitigation_control_command_line:
Mitigation control on the kernel command line
---------------------------------------------
The kernel command line allows to control the TAA mitigations at boot time with
the option "tsx_async_abort=". The valid arguments for this option are:
============ =============================================================
off This option disables the TAA mitigation on affected platforms.
If the system has TSX enabled (see next parameter) and the CPU
is affected, the system is vulnerable.
full TAA mitigation is enabled. If TSX is enabled, on an affected
system it will clear CPU buffers on ring transitions. On
systems which are MDS-affected and deploy MDS mitigation,
TAA is also mitigated. Specifying this option on those
systems will have no effect.
full,nosmt The same as tsx_async_abort=full, with SMT disabled on
vulnerable CPUs that have TSX enabled. This is the complete
mitigation. When TSX is disabled, SMT is not disabled because
CPU is not vulnerable to cross-thread TAA attacks.
============ =============================================================
Not specifying this option is equivalent to "tsx_async_abort=full". For
processors that are affected by both TAA and MDS, specifying just
"tsx_async_abort=off" without an accompanying "mds=off" will have no
effect as the same mitigation is used for both vulnerabilities.
The kernel command line also allows to control the TSX feature using the
parameter "tsx=" on CPUs which support TSX control. MSR_IA32_TSX_CTRL is used
to control the TSX feature and the enumeration of the TSX feature bits (RTM
and HLE) in CPUID.
The valid options are:
============ =============================================================
off Disables TSX on the system.
Note that this option takes effect only on newer CPUs which are
not vulnerable to MDS, i.e., have MSR_IA32_ARCH_CAPABILITIES.MDS_NO=1
and which get the new IA32_TSX_CTRL MSR through a microcode
update. This new MSR allows for the reliable deactivation of
the TSX functionality.
on Enables TSX.
Although there are mitigations for all known security
vulnerabilities, TSX has been known to be an accelerator for
several previous speculation-related CVEs, and so there may be
unknown security risks associated with leaving it enabled.
auto Disables TSX if X86_BUG_TAA is present, otherwise enables TSX
on the system.
============ =============================================================
Not specifying this option is equivalent to "tsx=off".
The following combinations of the "tsx_async_abort" and "tsx" are possible. For
affected platforms tsx=auto is equivalent to tsx=off and the result will be:
========= ========================== =========================================
tsx=on tsx_async_abort=full The system will use VERW to clear CPU
buffers. Cross-thread attacks are still
possible on SMT machines.
tsx=on tsx_async_abort=full,nosmt As above, cross-thread attacks on SMT
mitigated.
tsx=on tsx_async_abort=off The system is vulnerable.
tsx=off tsx_async_abort=full TSX might be disabled if microcode
provides a TSX control MSR. If so,
system is not vulnerable.
tsx=off tsx_async_abort=full,nosmt Ditto
tsx=off tsx_async_abort=off ditto
========= ========================== =========================================
For unaffected platforms "tsx=on" and "tsx_async_abort=full" does not clear CPU
buffers. For platforms without TSX control (MSR_IA32_ARCH_CAPABILITIES.MDS_NO=0)
"tsx" command line argument has no effect.
For the affected platforms below table indicates the mitigation status for the
combinations of CPUID bit MD_CLEAR and IA32_ARCH_CAPABILITIES MSR bits MDS_NO
and TSX_CTRL_MSR.
======= ========= ============= ========================================
MDS_NO MD_CLEAR TSX_CTRL_MSR Status
======= ========= ============= ========================================
0 0 0 Vulnerable (needs microcode)
0 1 0 MDS and TAA mitigated via VERW
1 1 0 MDS fixed, TAA vulnerable if TSX enabled
because MD_CLEAR has no meaning and
VERW is not guaranteed to clear buffers
1 X 1 MDS fixed, TAA can be mitigated by
VERW or TSX_CTRL_MSR
======= ========= ============= ========================================
Mitigation selection guide
--------------------------
1. Trusted userspace and guests
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If all user space applications are from a trusted source and do not execute
untrusted code which is supplied externally, then the mitigation can be
disabled. The same applies to virtualized environments with trusted guests.
2. Untrusted userspace and guests
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If there are untrusted applications or guests on the system, enabling TSX
might allow a malicious actor to leak data from the host or from other
processes running on the same physical core.
If the microcode is available and the TSX is disabled on the host, attacks
are prevented in a virtualized environment as well, even if the VMs do not
explicitly enable the mitigation.
.. _taa_default_mitigations:
Default mitigations
-------------------
The kernel's default action for vulnerable processors is:
- Deploy TSX disable mitigation (tsx_async_abort=full tsx=off).

View File

@@ -113,7 +113,7 @@
the GPE dispatcher.
This facility can be used to prevent such uncontrolled
GPE floodings.
Format: <int>
Format: <byte>
acpi_no_auto_serialize [HW,ACPI]
Disable auto-serialization of AML methods
@@ -1955,6 +1955,12 @@
Built with CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF=y,
the default is off.
kpti= [ARM64] Control page table isolation of user
and kernel address spaces.
Default: enabled on cores which need mitigation.
0: force disabled
1: force enabled
kvm.ignore_msrs=[KVM] Ignore guest accesses to unhandled MSRs.
Default is 0 (don't ignore, but inject #GP)
@@ -1965,6 +1971,25 @@
KVM MMU at runtime.
Default is 0 (off)
kvm.nx_huge_pages=
[KVM] Controls the software workaround for the
X86_BUG_ITLB_MULTIHIT bug.
force : Always deploy workaround.
off : Never deploy workaround.
auto : Deploy workaround based on the presence of
X86_BUG_ITLB_MULTIHIT.
Default is 'auto'.
If the software workaround is enabled for the host,
guests do need not to enable it for nested guests.
kvm.nx_huge_pages_recovery_ratio=
[KVM] Controls how many 4KiB pages are periodically zapped
back to huge pages. 0 disables the recovery, otherwise if
the value is N KVM will zap 1/Nth of the 4KiB pages every
minute. The default is 60.
kvm-amd.nested= [KVM,AMD] Allow nested virtualization in KVM/SVM.
Default is 1 (enabled)
@@ -2349,6 +2374,12 @@
SMT on vulnerable CPUs
off - Unconditionally disable MDS mitigation
On TAA-affected machines, mds=off can be prevented by
an active TAA mitigation as both vulnerabilities are
mitigated with the same mechanism so in order to disable
this mitigation, you need to specify tsx_async_abort=off
too.
Not specifying this option is equivalent to
mds=full.
@@ -2532,6 +2563,13 @@
ssbd=force-off [ARM64]
l1tf=off [X86]
mds=off [X86]
tsx_async_abort=off [X86]
kvm.nx_huge_pages=off [X86]
Exceptions:
This does not have any effect on
kvm.nx_huge_pages when
kvm.nx_huge_pages=force.
auto (default)
Mitigate all CPU vulnerabilities, but leave SMT
@@ -2547,6 +2585,7 @@
be fully mitigated, even if it means losing SMT.
Equivalent to: l1tf=flush,nosmt [X86]
mds=full,nosmt [X86]
tsx_async_abort=full,nosmt [X86]
mminit_loglevel=
[KNL] When CONFIG_DEBUG_MEMORY_INIT is set, this
@@ -4709,6 +4748,76 @@
marks the TSC unconditionally unstable at bootup and
avoids any further wobbles once the TSC watchdog notices.
tsx= [X86] Control Transactional Synchronization
Extensions (TSX) feature in Intel processors that
support TSX control.
This parameter controls the TSX feature. The options are:
on - Enable TSX on the system. Although there are
mitigations for all known security vulnerabilities,
TSX has been known to be an accelerator for
several previous speculation-related CVEs, and
so there may be unknown security risks associated
with leaving it enabled.
off - Disable TSX on the system. (Note that this
option takes effect only on newer CPUs which are
not vulnerable to MDS, i.e., have
MSR_IA32_ARCH_CAPABILITIES.MDS_NO=1 and which get
the new IA32_TSX_CTRL MSR through a microcode
update. This new MSR allows for the reliable
deactivation of the TSX functionality.)
auto - Disable TSX if X86_BUG_TAA is present,
otherwise enable TSX on the system.
Not specifying this option is equivalent to tsx=off.
See Documentation/admin-guide/hw-vuln/tsx_async_abort.rst
for more details.
tsx_async_abort= [X86,INTEL] Control mitigation for the TSX Async
Abort (TAA) vulnerability.
Similar to Micro-architectural Data Sampling (MDS)
certain CPUs that support Transactional
Synchronization Extensions (TSX) are vulnerable to an
exploit against CPU internal buffers which can forward
information to a disclosure gadget under certain
conditions.
In vulnerable processors, the speculatively forwarded
data can be used in a cache side channel attack, to
access data to which the attacker does not have direct
access.
This parameter controls the TAA mitigation. The
options are:
full - Enable TAA mitigation on vulnerable CPUs
if TSX is enabled.
full,nosmt - Enable TAA mitigation and disable SMT on
vulnerable CPUs. If TSX is disabled, SMT
is not disabled because CPU is not
vulnerable to cross-thread TAA attacks.
off - Unconditionally disable TAA mitigation
On MDS-affected machines, tsx_async_abort=off can be
prevented by an active MDS mitigation as both vulnerabilities
are mitigated with the same mechanism so in order to disable
this mitigation, you need to specify mds=off too.
Not specifying this option is equivalent to
tsx_async_abort=full. On CPUs which are MDS affected
and deploy MDS mitigation, TAA mitigation is not
required and doesn't provide any additional
mitigation.
For details see:
Documentation/admin-guide/hw-vuln/tsx_async_abort.rst
turbografx.map[2|3]= [HW,JOY]
TurboGraFX parallel port interface
Format:
@@ -4857,13 +4966,13 @@
Flags is a set of characters, each corresponding
to a common usb-storage quirk flag as follows:
a = SANE_SENSE (collect more than 18 bytes
of sense data);
of sense data, not on uas);
b = BAD_SENSE (don't collect more than 18
bytes of sense data);
bytes of sense data, not on uas);
c = FIX_CAPACITY (decrease the reported
device capacity by one sector);
d = NO_READ_DISC_INFO (don't use
READ_DISC_INFO command);
READ_DISC_INFO command, not on uas);
e = NO_READ_CAPACITY_16 (don't use
READ_CAPACITY_16 command);
f = NO_REPORT_OPCODES (don't use report opcodes
@@ -4878,17 +4987,18 @@
j = NO_REPORT_LUNS (don't use report luns
command, uas only);
l = NOT_LOCKABLE (don't try to lock and
unlock ejectable media);
unlock ejectable media, not on uas);
m = MAX_SECTORS_64 (don't transfer more
than 64 sectors = 32 KB at a time);
than 64 sectors = 32 KB at a time,
not on uas);
n = INITIAL_READ10 (force a retry of the
initial READ(10) command);
initial READ(10) command, not on uas);
o = CAPACITY_OK (accept the capacity
reported by the device);
reported by the device, not on uas);
p = WRITE_CACHE (the device cache is ON
by default);
by default, not on uas);
r = IGNORE_RESIDUE (the device reports
bogus residue values);
bogus residue values, not on uas);
s = SINGLE_LUN (the device has only one
Logical Unit);
t = NO_ATA_1X (don't allow ATA(12) and ATA(16)
@@ -4897,7 +5007,8 @@
w = NO_WP_DETECT (don't test whether the
medium is write-protected).
y = ALWAYS_SYNC (issue a SYNCHRONIZE_CACHE
even if the device claims no cache)
even if the device claims no cache,
not on uas)
Example: quirks=0419:aaf5:rl,0421:0433:rc
user_debug= [KNL,ARM]
@@ -5136,6 +5247,10 @@
the unplug protocol
never -- do not unplug even if version check succeeds
xen_legacy_crash [X86,XEN]
Crash from Xen panic notifier, without executing late
panic() code such as dumping handler.
xen_nopvspin [X86,XEN]
Disables the ticketlock slowpath using Xen PV
optimizations.

View File

@@ -0,0 +1,183 @@
.. SPDX-License-Identifier: GPL-2.0
=================
Inline Encryption
=================
Objective
=========
We want to support inline encryption (IE) in the kernel.
To allow for testing, we also want a crypto API fallback when actual
IE hardware is absent. We also want IE to work with layered devices
like dm and loopback (i.e. we want to be able to use the IE hardware
of the underlying devices if present, or else fall back to crypto API
en/decryption).
Constraints and notes
=====================
- IE hardware have a limited number of "keyslots" that can be programmed
with an encryption context (key, algorithm, data unit size, etc.) at any time.
One can specify a keyslot in a data request made to the device, and the
device will en/decrypt the data using the encryption context programmed into
that specified keyslot. When possible, we want to make multiple requests with
the same encryption context share the same keyslot.
- We need a way for filesystems to specify an encryption context to use for
en/decrypting a struct bio, and a device driver (like UFS) needs to be able
to use that encryption context when it processes the bio.
- We need a way for device drivers to expose their capabilities in a unified
way to the upper layers.
Design
======
We add a struct bio_crypt_ctx to struct bio that can represent an
encryption context, because we need to be able to pass this encryption
context from the FS layer to the device driver to act upon.
While IE hardware works on the notion of keyslots, the FS layer has no
knowledge of keyslots - it simply wants to specify an encryption context to
use while en/decrypting a bio.
We introduce a keyslot manager (KSM) that handles the translation from
encryption contexts specified by the FS to keyslots on the IE hardware.
This KSM also serves as the way IE hardware can expose their capabilities to
upper layers. The generic mode of operation is: each device driver that wants
to support IE will construct a KSM and set it up in its struct request_queue.
Upper layers that want to use IE on this device can then use this KSM in
the device's struct request_queue to translate an encryption context into
a keyslot. The presence of the KSM in the request queue shall be used to mean
that the device supports IE.
On the device driver end of the interface, the device driver needs to tell the
KSM how to actually manipulate the IE hardware in the device to do things like
programming the crypto key into the IE hardware into a particular keyslot. All
this is achieved through the :c:type:`struct keyslot_mgmt_ll_ops` that the
device driver passes to the KSM when creating it.
It uses refcounts to track which keyslots are idle (either they have no
encryption context programmed, or there are no in-flight struct bios
referencing that keyslot). When a new encryption context needs a keyslot, it
tries to find a keyslot that has already been programmed with the same
encryption context, and if there is no such keyslot, it evicts the least
recently used idle keyslot and programs the new encryption context into that
one. If no idle keyslots are available, then the caller will sleep until there
is at least one.
Blk-crypto
==========
The above is sufficient for simple cases, but does not work if there is a
need for a crypto API fallback, or if we are want to use IE with layered
devices. To these ends, we introduce blk-crypto. Blk-crypto allows us to
present a unified view of encryption to the FS (so FS only needs to specify
an encryption context and not worry about keyslots at all), and blk-crypto
can decide whether to delegate the en/decryption to IE hardware or to the
crypto API. Blk-crypto maintains an internal KSM that serves as the crypto
API fallback.
Blk-crypto needs to ensure that the encryption context is programmed into the
"correct" keyslot manager for IE. If a bio is submitted to a layered device
that eventually passes the bio down to a device that really does support IE, we
want the encryption context to be programmed into a keyslot for the KSM of the
device with IE support. However, blk-crypto does not know a priori whether a
particular device is the final device in the layering structure for a bio or
not. So in the case that a particular device does not support IE, since it is
possibly the final destination device for the bio, if the bio requires
encryption (i.e. the bio is doing a write operation), blk-crypto must fallback
to the crypto API *before* sending the bio to the device.
Blk-crypto ensures that:
- The bio's encryption context is programmed into a keyslot in the KSM of the
request queue that the bio is being submitted to (or the crypto API fallback
KSM if the request queue doesn't have a KSM), and that the ``bc_ksm``
in the ``bi_crypt_context`` is set to this KSM
- That the bio has its own individual reference to the keyslot in this KSM.
Once the bio passes through blk-crypto, its encryption context is programmed
in some KSM. The "its own individual reference to the keyslot" ensures that
keyslots can be released by each bio independently of other bios while
ensuring that the bio has a valid reference to the keyslot when, for e.g., the
crypto API fallback KSM in blk-crypto performs crypto on the device's behalf.
The individual references are ensured by increasing the refcount for the
keyslot in the ``bc_ksm`` when a bio with a programmed encryption
context is cloned.
What blk-crypto does on bio submission
--------------------------------------
**Case 1:** blk-crypto is given a bio with only an encryption context that hasn't
been programmed into any keyslot in any KSM (for e.g. a bio from the FS).
In this case, blk-crypto will program the encryption context into the KSM of the
request queue the bio is being submitted to (and if this KSM does not exist,
then it will program it into blk-crypto's internal KSM for crypto API
fallback). The KSM that this encryption context was programmed into is stored
as the ``bc_ksm`` in the bio's ``bi_crypt_context``.
**Case 2:** blk-crypto is given a bio whose encryption context has already been
programmed into a keyslot in the *crypto API fallback* KSM.
In this case, blk-crypto does nothing; it treats the bio as not having
specified an encryption context. Note that we cannot do here what we will do
in Case 3 because we would have already encrypted the bio via the crypto API
by this point.
**Case 3:** blk-crypto is given a bio whose encryption context has already been
programmed into a keyslot in some KSM (that is *not* the crypto API fallback
KSM).
In this case, blk-crypto first releases that keyslot from that KSM and then
treats the bio as in Case 1.
This way, when a device driver is processing a bio, it can be sure that
the bio's encryption context has been programmed into some KSM (either the
device driver's request queue's KSM, or blk-crypto's crypto API fallback KSM).
It then simply needs to check if the bio's ``bc_ksm`` is the device's
request queue's KSM. If so, then it should proceed with IE. If not, it should
simply do nothing with respect to crypto, because some other KSM (perhaps the
blk-crypto crypto API fallback KSM) is handling the en/decryption.
Blk-crypto will release the keyslot that is being held by the bio (and also
decrypt it if the bio is using the crypto API fallback KSM) once
``bio_remaining_done`` returns true for the bio.
Layered Devices
===============
Layered devices that wish to support IE need to create their own keyslot
manager for their request queue, and expose whatever functionality they choose.
When a layered device wants to pass a bio to another layer (either by
resubmitting the same bio, or by submitting a clone), it doesn't need to do
anything special because the bio (or the clone) will once again pass through
blk-crypto, which will work as described in Case 3. If a layered device wants
for some reason to do the IO by itself instead of passing it on to a child
device, but it also chose to expose IE capabilities by setting up a KSM in its
request queue, it is then responsible for en/decrypting the data itself. In
such cases, the device can choose to call the blk-crypto function
``blk_crypto_fallback_to_kernel_crypto_api`` (TODO: Not yet implemented), which will
cause the en/decryption to be done via the crypto API fallback.
Future Optimizations for layered devices
========================================
Creating a keyslot manager for the layered device uses up memory for each
keyslot, and in general, a layered device (like dm-linear) merely passes the
request on to a "child" device, so the keyslots in the layered device itself
might be completely unused. We can instead define a new type of KSM; the
"passthrough KSM", that layered devices can use to let blk-crypto know that
this layered device *will* pass the bio to some child device (and hence
through blk-crypto again, at which point blk-crypto can program the encryption
context, instead of programming it into the layered device's KSM). Again, if
the device "lies" and decides to do the IO itself instead of passing it on to
a child device, it is responsible for doing the en/decryption (and can choose
to call ``blk_crypto_fallback_to_kernel_crypto_api``). Another use case for the
"passthrough KSM" is for IE devices that want to manage their own keyslots/do
not have a limited number of keyslots.

View File

@@ -34,6 +34,7 @@ Profiling data will only become accessible once debugfs has been mounted::
Coverage collection
-------------------
The following program demonstrates coverage collection from within a test
program using kcov:
@@ -128,6 +129,7 @@ only need to enable coverage (disable happens automatically on thread end).
Comparison operands collection
------------------------------
Comparison operands collection is similar to coverage collection:
.. code-block:: c
@@ -202,3 +204,130 @@ Comparison operands collection is similar to coverage collection:
Note that the kcov modes (coverage collection or comparison operands) are
mutually exclusive.
Remote coverage collection
--------------------------
With KCOV_ENABLE coverage is collected only for syscalls that are issued
from the current process. With KCOV_REMOTE_ENABLE it's possible to collect
coverage for arbitrary parts of the kernel code, provided that those parts
are annotated with kcov_remote_start()/kcov_remote_stop().
This allows to collect coverage from two types of kernel background
threads: the global ones, that are spawned during kernel boot in a limited
number of instances (e.g. one USB hub_event() worker thread is spawned per
USB HCD); and the local ones, that are spawned when a user interacts with
some kernel interface (e.g. vhost workers).
To enable collecting coverage from a global background thread, a unique
global handle must be assigned and passed to the corresponding
kcov_remote_start() call. Then a userspace process can pass a list of such
handles to the KCOV_REMOTE_ENABLE ioctl in the handles array field of the
kcov_remote_arg struct. This will attach the used kcov device to the code
sections, that are referenced by those handles.
Since there might be many local background threads spawned from different
userspace processes, we can't use a single global handle per annotation.
Instead, the userspace process passes a non-zero handle through the
common_handle field of the kcov_remote_arg struct. This common handle gets
saved to the kcov_handle field in the current task_struct and needs to be
passed to the newly spawned threads via custom annotations. Those threads
should in turn be annotated with kcov_remote_start()/kcov_remote_stop().
Internally kcov stores handles as u64 integers. The top byte of a handle
is used to denote the id of a subsystem that this handle belongs to, and
the lower 4 bytes are used to denote the id of a thread instance within
that subsystem. A reserved value 0 is used as a subsystem id for common
handles as they don't belong to a particular subsystem. The bytes 4-7 are
currently reserved and must be zero. In the future the number of bytes
used for the subsystem or handle ids might be increased.
When a particular userspace proccess collects coverage by via a common
handle, kcov will collect coverage for each code section that is annotated
to use the common handle obtained as kcov_handle from the current
task_struct. However non common handles allow to collect coverage
selectively from different subsystems.
.. code-block:: c
struct kcov_remote_arg {
__u32 trace_mode;
__u32 area_size;
__u32 num_handles;
__aligned_u64 common_handle;
__aligned_u64 handles[0];
};
#define KCOV_INIT_TRACE _IOR('c', 1, unsigned long)
#define KCOV_DISABLE _IO('c', 101)
#define KCOV_REMOTE_ENABLE _IOW('c', 102, struct kcov_remote_arg)
#define COVER_SIZE (64 << 10)
#define KCOV_TRACE_PC 0
#define KCOV_SUBSYSTEM_COMMON (0x00ull << 56)
#define KCOV_SUBSYSTEM_USB (0x01ull << 56)
#define KCOV_SUBSYSTEM_MASK (0xffull << 56)
#define KCOV_INSTANCE_MASK (0xffffffffull)
static inline __u64 kcov_remote_handle(__u64 subsys, __u64 inst)
{
if (subsys & ~KCOV_SUBSYSTEM_MASK || inst & ~KCOV_INSTANCE_MASK)
return 0;
return subsys | inst;
}
#define KCOV_COMMON_ID 0x42
#define KCOV_USB_BUS_NUM 1
int main(int argc, char **argv)
{
int fd;
unsigned long *cover, n, i;
struct kcov_remote_arg *arg;
fd = open("/sys/kernel/debug/kcov", O_RDWR);
if (fd == -1)
perror("open"), exit(1);
if (ioctl(fd, KCOV_INIT_TRACE, COVER_SIZE))
perror("ioctl"), exit(1);
cover = (unsigned long*)mmap(NULL, COVER_SIZE * sizeof(unsigned long),
PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
if ((void*)cover == MAP_FAILED)
perror("mmap"), exit(1);
/* Enable coverage collection via common handle and from USB bus #1. */
arg = calloc(1, sizeof(*arg) + sizeof(uint64_t));
if (!arg)
perror("calloc"), exit(1);
arg->trace_mode = KCOV_TRACE_PC;
arg->area_size = COVER_SIZE;
arg->num_handles = 1;
arg->common_handle = kcov_remote_handle(KCOV_SUBSYSTEM_COMMON,
KCOV_COMMON_ID);
arg->handles[0] = kcov_remote_handle(KCOV_SUBSYSTEM_USB,
KCOV_USB_BUS_NUM);
if (ioctl(fd, KCOV_REMOTE_ENABLE, arg))
perror("ioctl"), free(arg), exit(1);
free(arg);
/*
* Here the user needs to trigger execution of a kernel code section
* that is either annotated with the common handle, or to trigger some
* activity on USB bus #1.
*/
sleep(2);
n = __atomic_load_n(&cover[0], __ATOMIC_RELAXED);
for (i = 0; i < n; i++)
printf("0x%lx\n", cover[i + 1]);
if (ioctl(fd, KCOV_DISABLE, 0))
perror("ioctl"), exit(1);
if (munmap(cover, COVER_SIZE * sizeof(unsigned long)))
perror("munmap"), exit(1);
if (close(fd))
perror("close"), exit(1);
return 0;
}

View File

@@ -75,6 +75,15 @@ its hardware characteristcs.
* port or ports: same as above.
* Optional properties for all components:
* arm,coresight-loses-context-with-cpu : boolean. Indicates that the
hardware will lose register context on CPU power down (e.g. CPUIdle).
An example of where this may be needed are systems which contain a
coresight component and CPU in the same power domain. When the CPU
powers down the coresight component also powers down and loses its
context. This property is currently only used for the ETM 4.x driver.
* Optional properties for ETM/PTMs:
* arm,cp14: must be present if the system accesses ETM/PTM management

View File

@@ -35,6 +35,7 @@ Required standard properties:
"ti,sysc-omap3-sham"
"ti,sysc-omap-aes"
"ti,sysc-mcasp"
"ti,sysc-dra7-mcasp"
"ti,sysc-usb-host-fs"
"ti,sysc-dra7-mcan"

View File

@@ -46,7 +46,7 @@ Required properties:
Example (R-Car H3):
usb2_clksel: clock-controller@e6590630 {
compatible = "renesas,r8a77950-rcar-usb2-clock-sel",
compatible = "renesas,r8a7795-rcar-usb2-clock-sel",
"renesas,rcar-gen3-usb2-clock-sel";
reg = <0 0xe6590630 0 0x02>;
clocks = <&cpg CPG_MOD 703>, <&usb_extal>, <&usb_xtal>;

View File

@@ -73,7 +73,7 @@ Example:
};
};
port@10 {
port@a {
reg = <10>;
adv7482_txa: endpoint {
@@ -83,7 +83,7 @@ Example:
};
};
port@11 {
port@b {
reg = <11>;
adv7482_txb: endpoint {

View File

@@ -19,6 +19,9 @@ Optional properties:
- interrupt-names: must be "mdio_done_error" when there is a share interrupt fed
to this hardware block, or must be "mdio_done" for the first interrupt and
"mdio_error" for the second when there are separate interrupts
- clocks: A reference to the clock supplying the MDIO bus controller
- clock-frequency: the MDIO bus clock that must be output by the MDIO bus
hardware, if absent, the default hardware values are used
Child nodes of this MDIO bus controller node are standard Ethernet PHY device
nodes as described in Documentation/devicetree/bindings/net/phy.txt

View File

@@ -0,0 +1,27 @@
OMAP ROM RNG driver binding
Secure SoCs may provide RNG via secure ROM calls like Nokia N900 does. The
implementation can depend on the SoC secure ROM used.
- compatible:
Usage: required
Value type: <string>
Definition: must be "nokia,n900-rom-rng"
- clocks:
Usage: required
Value type: <prop-encoded-array>
Definition: reference to the the RNG interface clock
- clock-names:
Usage: required
Value type: <stringlist>
Definition: must be "ick"
Example:
rom_rng: rng {
compatible = "nokia,n900-rom-rng";
clocks = <&rng_ick>;
clock-names = "ick";
};

View File

@@ -27,4 +27,4 @@ and valid to enable charging:
- "abracon,tc-diode": should be "standard" (0.6V) or "schottky" (0.3V)
- "abracon,tc-resistor": should be <0>, <3>, <6> or <11>. 0 disables the output
resistor, the other values are in ohm.
resistor, the other values are in kOhm.

Some files were not shown because too many files have changed in this diff Show More