mirror of
https://github.com/armbian/linux-cix.git
synced 2026-01-06 12:30:45 -08:00
Merge tag 'docs-5.2a' of git://git.lwn.net/linux
Pull more documentation updates from Jonathan Corbet: "Some late arriving documentation changes. In particular, this contains the conversion of the x86 docs to RST, which has been in the works for some time but needed a couple of final tweaks" * tag 'docs-5.2a' of git://git.lwn.net/linux: (29 commits) Documentation: x86: convert x86_64/machinecheck to reST Documentation: x86: convert x86_64/cpu-hotplug-spec to reST Documentation: x86: convert x86_64/fake-numa-for-cpusets to reST Documentation: x86: convert x86_64/5level-paging.txt to reST Documentation: x86: convert x86_64/mm.txt to reST Documentation: x86: convert x86_64/uefi.txt to reST Documentation: x86: convert x86_64/boot-options.txt to reST Documentation: x86: convert i386/IO-APIC.txt to reST Documentation: x86: convert usb-legacy-support.txt to reST Documentation: x86: convert orc-unwinder.txt to reST Documentation: x86: convert resctrl_ui.txt to reST Documentation: x86: convert microcode.txt to reST Documentation: x86: convert pti.txt to reST Documentation: x86: convert amd-memory-encryption.txt to reST Documentation: x86: convert intel_mpx.txt to reST Documentation: x86: convert protection-keys.txt to reST Documentation: x86: convert pat.txt to reST Documentation: x86: convert mtrr.txt to reST Documentation: x86: convert tlb.txt to reST Documentation: x86: convert zero-page.txt to reST ...
This commit is contained in:
@@ -112,6 +112,7 @@ implementation.
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
x86/index
|
||||
sh/index
|
||||
|
||||
Filesystem Documentation
|
||||
|
||||
@@ -1915,7 +1915,10 @@ The following commonly-used handler.action pairs are available:
|
||||
|
||||
The 'matching.event' specification is simply the fully qualified
|
||||
event name of the event that matches the target event for the
|
||||
onmatch() functionality, in the form 'system.event_name'.
|
||||
onmatch() functionality, in the form 'system.event_name'. Histogram
|
||||
keys of both events are compared to find if events match. In case
|
||||
multiple histogram keys are used, they all must match in the specified
|
||||
order.
|
||||
|
||||
Finally, the number and type of variables/fields in the 'param
|
||||
list' must match the number and types of the fields in the
|
||||
@@ -1978,9 +1981,9 @@ The following commonly-used handler.action pairs are available:
|
||||
/sys/kernel/debug/tracing/events/sched/sched_waking/trigger
|
||||
|
||||
Then, when the corresponding thread is actually scheduled onto the
|
||||
CPU by a sched_switch event, calculate the latency and use that
|
||||
along with another variable and an event field to generate a
|
||||
wakeup_latency synthetic event::
|
||||
CPU by a sched_switch event (saved_pid matches next_pid), calculate
|
||||
the latency and use that along with another variable and an event field
|
||||
to generate a wakeup_latency synthetic event::
|
||||
|
||||
# echo 'hist:keys=next_pid:wakeup_lat=common_timestamp.usecs-$ts0:\
|
||||
onmatch(sched.sched_waking).wakeup_latency($wakeup_lat,\
|
||||
|
||||
@@ -249,13 +249,13 @@ essere categorizzate in:
|
||||
|
||||
|
|
||||
|
||||
2. Licenze non raccomandate:
|
||||
2. Licenze deprecate:
|
||||
|
||||
Questo tipo di licenze dovrebbero essere usate solo per codice già esistente
|
||||
o quando si prende codice da altri progetti. Le licenze sono disponibili
|
||||
nei sorgenti del kernel nella cartella::
|
||||
|
||||
LICENSES/other/
|
||||
LICENSES/deprecated/
|
||||
|
||||
I file in questa cartella contengono il testo completo della licenza e i
|
||||
`Metatag`_. Il nome di questi file è lo stesso usato come identificatore
|
||||
@@ -263,14 +263,14 @@ essere categorizzate in:
|
||||
|
||||
Esempi::
|
||||
|
||||
LICENSES/other/ISC
|
||||
LICENSES/deprecated/ISC
|
||||
|
||||
Contiene il testo della licenza Internet System Consortium e i suoi
|
||||
metatag::
|
||||
|
||||
LICENSES/other/ZLib
|
||||
LICENSES/deprecated/GPL-1.0
|
||||
|
||||
Contiene il testo della licenza ZLIB e i suoi metatag.
|
||||
Contiene il testo della versione 1 della licenza GPL e i suoi metatag.
|
||||
|
||||
Metatag:
|
||||
|
||||
@@ -294,7 +294,55 @@ essere categorizzate in:
|
||||
|
||||
|
|
||||
|
||||
3. _`Eccezioni`:
|
||||
3. Solo per doppie licenze
|
||||
|
||||
Queste licenze dovrebbero essere usate solamente per codice licenziato in
|
||||
combinazione con un'altra licenza che solitamente è quella preferita.
|
||||
Queste licenze sono disponibili nei sorgenti del kernel nella cartella::
|
||||
|
||||
LICENSES/dual
|
||||
|
||||
I file in questa cartella contengono il testo completo della rispettiva
|
||||
licenza e i suoi `Metatags`_. I nomi dei file sono identici agli
|
||||
identificatori di licenza SPDX che dovrebbero essere usati nei file
|
||||
sorgenti.
|
||||
|
||||
Esempi::
|
||||
|
||||
LICENSES/dual/MPL-1.1
|
||||
|
||||
Questo file contiene il testo della versione 1.1 della licenza *Mozilla
|
||||
Pulic License* e i metatag necessari::
|
||||
|
||||
LICENSES/dual/Apache-2.0
|
||||
|
||||
Questo file contiene il testo della versione 2.0 della licenza Apache e i
|
||||
metatag necessari.
|
||||
|
||||
Metatag:
|
||||
|
||||
I requisiti per le 'altre' ('*other*') licenze sono identici a quelli per le
|
||||
`Licenze raccomandate`_.
|
||||
|
||||
Esempio del formato del file::
|
||||
|
||||
Valid-License-Identifier: MPL-1.1
|
||||
SPDX-URL: https://spdx.org/licenses/MPL-1.1.html
|
||||
Usage-Guide:
|
||||
Do NOT use. The MPL-1.1 is not GPL2 compatible. It may only be used for
|
||||
dual-licensed files where the other license is GPL2 compatible.
|
||||
If you end up using this it MUST be used together with a GPL2 compatible
|
||||
license using "OR".
|
||||
To use the Mozilla Public License version 1.1 put the following SPDX
|
||||
tag/value pair into a comment according to the placement guidelines in
|
||||
the licensing rules documentation:
|
||||
SPDX-License-Identifier: MPL-1.1
|
||||
License-Text:
|
||||
Full license text
|
||||
|
||||
|
|
||||
|
||||
4. _`Eccezioni`:
|
||||
|
||||
Alcune licenze possono essere corrette con delle eccezioni che forniscono
|
||||
diritti aggiuntivi. Queste eccezioni sono disponibili nei sorgenti del
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
=====================
|
||||
AMD Memory Encryption
|
||||
=====================
|
||||
|
||||
Secure Memory Encryption (SME) and Secure Encrypted Virtualization (SEV) are
|
||||
features found on AMD processors.
|
||||
|
||||
@@ -34,7 +40,7 @@ is operating in 64-bit or 32-bit PAE mode, in all other modes the SEV hardware
|
||||
forces the memory encryption bit to 1.
|
||||
|
||||
Support for SME and SEV can be determined through the CPUID instruction. The
|
||||
CPUID function 0x8000001f reports information related to SME:
|
||||
CPUID function 0x8000001f reports information related to SME::
|
||||
|
||||
0x8000001f[eax]:
|
||||
Bit[0] indicates support for SME
|
||||
@@ -48,14 +54,14 @@ CPUID function 0x8000001f reports information related to SME:
|
||||
addresses)
|
||||
|
||||
If support for SME is present, MSR 0xc00100010 (MSR_K8_SYSCFG) can be used to
|
||||
determine if SME is enabled and/or to enable memory encryption:
|
||||
determine if SME is enabled and/or to enable memory encryption::
|
||||
|
||||
0xc0010010:
|
||||
Bit[23] 0 = memory encryption features are disabled
|
||||
1 = memory encryption features are enabled
|
||||
|
||||
If SEV is supported, MSR 0xc0010131 (MSR_AMD64_SEV) can be used to determine if
|
||||
SEV is active:
|
||||
SEV is active::
|
||||
|
||||
0xc0010131:
|
||||
Bit[0] 0 = memory encryption is not active
|
||||
@@ -68,6 +74,7 @@ requirements for the system. If this bit is not set upon Linux startup then
|
||||
Linux itself will not set it and memory encryption will not be possible.
|
||||
|
||||
The state of SME in the Linux kernel can be documented as follows:
|
||||
|
||||
- Supported:
|
||||
The CPU supports SME (determined through CPUID instruction).
|
||||
|
||||
File diff suppressed because it is too large
Load Diff
@@ -1,52 +1,58 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
============
|
||||
Early Printk
|
||||
============
|
||||
|
||||
Mini-HOWTO for using the earlyprintk=dbgp boot option with a
|
||||
USB2 Debug port key and a debug cable, on x86 systems.
|
||||
|
||||
You need two computers, the 'USB debug key' special gadget and
|
||||
and two USB cables, connected like this:
|
||||
and two USB cables, connected like this::
|
||||
|
||||
[host/target] <-------> [USB debug key] <-------> [client/console]
|
||||
|
||||
1. There are a number of specific hardware requirements:
|
||||
Hardware requirements
|
||||
=====================
|
||||
|
||||
a.) Host/target system needs to have USB debug port capability.
|
||||
a) Host/target system needs to have USB debug port capability.
|
||||
|
||||
You can check this capability by looking at a 'Debug port' bit in
|
||||
the lspci -vvv output:
|
||||
You can check this capability by looking at a 'Debug port' bit in
|
||||
the lspci -vvv output::
|
||||
|
||||
# lspci -vvv
|
||||
...
|
||||
00:1d.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI Controller #1 (rev 03) (prog-if 20 [EHCI])
|
||||
Subsystem: Lenovo ThinkPad T61
|
||||
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
|
||||
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
|
||||
Latency: 0
|
||||
Interrupt: pin D routed to IRQ 19
|
||||
Region 0: Memory at fe227000 (32-bit, non-prefetchable) [size=1K]
|
||||
Capabilities: [50] Power Management version 2
|
||||
Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
|
||||
Status: D0 PME-Enable- DSel=0 DScale=0 PME+
|
||||
Capabilities: [58] Debug port: BAR=1 offset=00a0
|
||||
# lspci -vvv
|
||||
...
|
||||
00:1d.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI Controller #1 (rev 03) (prog-if 20 [EHCI])
|
||||
Subsystem: Lenovo ThinkPad T61
|
||||
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
|
||||
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
|
||||
Latency: 0
|
||||
Interrupt: pin D routed to IRQ 19
|
||||
Region 0: Memory at fe227000 (32-bit, non-prefetchable) [size=1K]
|
||||
Capabilities: [50] Power Management version 2
|
||||
Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
|
||||
Status: D0 PME-Enable- DSel=0 DScale=0 PME+
|
||||
Capabilities: [58] Debug port: BAR=1 offset=00a0
|
||||
^^^^^^^^^^^ <==================== [ HERE ]
|
||||
Kernel driver in use: ehci_hcd
|
||||
Kernel modules: ehci-hcd
|
||||
...
|
||||
Kernel driver in use: ehci_hcd
|
||||
Kernel modules: ehci-hcd
|
||||
...
|
||||
|
||||
( If your system does not list a debug port capability then you probably
|
||||
won't be able to use the USB debug key. )
|
||||
.. note::
|
||||
If your system does not list a debug port capability then you probably
|
||||
won't be able to use the USB debug key.
|
||||
|
||||
b.) You also need a NetChip USB debug cable/key:
|
||||
b) You also need a NetChip USB debug cable/key:
|
||||
|
||||
http://www.plxtech.com/products/NET2000/NET20DC/default.asp
|
||||
|
||||
This is a small blue plastic connector with two USB connections;
|
||||
it draws power from its USB connections.
|
||||
|
||||
c.) You need a second client/console system with a high speed USB 2.0
|
||||
port.
|
||||
c) You need a second client/console system with a high speed USB 2.0 port.
|
||||
|
||||
d.) The NetChip device must be plugged directly into the physical
|
||||
debug port on the "host/target" system. You cannot use a USB hub in
|
||||
d) The NetChip device must be plugged directly into the physical
|
||||
debug port on the "host/target" system. You cannot use a USB hub in
|
||||
between the physical debug port and the "host/target" system.
|
||||
|
||||
The EHCI debug controller is bound to a specific physical USB
|
||||
@@ -65,29 +71,31 @@ and two USB cables, connected like this:
|
||||
to the hardware vendor, because there is no reason not to wire
|
||||
this port into one of the physically accessible ports.
|
||||
|
||||
e.) It is also important to note, that many versions of the NetChip
|
||||
e) It is also important to note, that many versions of the NetChip
|
||||
device require the "client/console" system to be plugged into the
|
||||
right hand side of the device (with the product logo facing up and
|
||||
readable left to right). The reason being is that the 5 volt
|
||||
power supply is taken from only one side of the device and it
|
||||
must be the side that does not get rebooted.
|
||||
|
||||
2. Software requirements:
|
||||
Software requirements
|
||||
=====================
|
||||
|
||||
a.) On the host/target system:
|
||||
a) On the host/target system:
|
||||
|
||||
You need to enable the following kernel config option:
|
||||
You need to enable the following kernel config option::
|
||||
|
||||
CONFIG_EARLY_PRINTK_DBGP=y
|
||||
|
||||
And you need to add the boot command line: "earlyprintk=dbgp".
|
||||
|
||||
(If you are using Grub, append it to the 'kernel' line in
|
||||
/etc/grub.conf. If you are using Grub2 on a BIOS firmware system,
|
||||
append it to the 'linux' line in /boot/grub2/grub.cfg. If you are
|
||||
using Grub2 on an EFI firmware system, append it to the 'linux'
|
||||
or 'linuxefi' line in /boot/grub2/grub.cfg or
|
||||
/boot/efi/EFI/<distro>/grub.cfg.)
|
||||
.. note::
|
||||
If you are using Grub, append it to the 'kernel' line in
|
||||
/etc/grub.conf. If you are using Grub2 on a BIOS firmware system,
|
||||
append it to the 'linux' line in /boot/grub2/grub.cfg. If you are
|
||||
using Grub2 on an EFI firmware system, append it to the 'linux'
|
||||
or 'linuxefi' line in /boot/grub2/grub.cfg or
|
||||
/boot/efi/EFI/<distro>/grub.cfg.
|
||||
|
||||
On systems with more than one EHCI debug controller you must
|
||||
specify the correct EHCI debug controller number. The ordering
|
||||
@@ -96,14 +104,15 @@ and two USB cables, connected like this:
|
||||
controller. To use the second EHCI debug controller, you would
|
||||
use the command line: "earlyprintk=dbgp1"
|
||||
|
||||
NOTE: normally earlyprintk console gets turned off once the
|
||||
regular console is alive - use "earlyprintk=dbgp,keep" to keep
|
||||
this channel open beyond early bootup. This can be useful for
|
||||
debugging crashes under Xorg, etc.
|
||||
.. note::
|
||||
normally earlyprintk console gets turned off once the
|
||||
regular console is alive - use "earlyprintk=dbgp,keep" to keep
|
||||
this channel open beyond early bootup. This can be useful for
|
||||
debugging crashes under Xorg, etc.
|
||||
|
||||
b.) On the client/console system:
|
||||
b) On the client/console system:
|
||||
|
||||
You should enable the following kernel config option:
|
||||
You should enable the following kernel config option::
|
||||
|
||||
CONFIG_USB_SERIAL_DEBUG=y
|
||||
|
||||
@@ -115,27 +124,28 @@ and two USB cables, connected like this:
|
||||
it up to use /dev/ttyUSB0 - or use a raw 'cat /dev/ttyUSBx' to
|
||||
see the raw output.
|
||||
|
||||
c.) On Nvidia Southbridge based systems: the kernel will try to probe
|
||||
c) On Nvidia Southbridge based systems: the kernel will try to probe
|
||||
and find out which port has a debug device connected.
|
||||
|
||||
3. Testing that it works fine:
|
||||
Testing
|
||||
=======
|
||||
|
||||
You can test the output by using earlyprintk=dbgp,keep and provoking
|
||||
kernel messages on the host/target system. You can provoke a harmless
|
||||
kernel message by for example doing:
|
||||
You can test the output by using earlyprintk=dbgp,keep and provoking
|
||||
kernel messages on the host/target system. You can provoke a harmless
|
||||
kernel message by for example doing::
|
||||
|
||||
echo h > /proc/sysrq-trigger
|
||||
|
||||
On the host/target system you should see this help line in "dmesg" output:
|
||||
On the host/target system you should see this help line in "dmesg" output::
|
||||
|
||||
SysRq : HELP : loglevel(0-9) reBoot Crashdump terminate-all-tasks(E) memory-full-oom-kill(F) kill-all-tasks(I) saK show-backtrace-all-active-cpus(L) show-memory-usage(M) nice-all-RT-tasks(N) powerOff show-registers(P) show-all-timers(Q) unRaw Sync show-task-states(T) Unmount show-blocked-tasks(W) dump-ftrace-buffer(Z)
|
||||
|
||||
On the client/console system do:
|
||||
On the client/console system do::
|
||||
|
||||
cat /dev/ttyUSB0
|
||||
|
||||
And you should see the help line above displayed shortly after you've
|
||||
provoked it on the host system.
|
||||
And you should see the help line above displayed shortly after you've
|
||||
provoked it on the host system.
|
||||
|
||||
If it does not work then please ask about it on the linux-kernel@vger.kernel.org
|
||||
mailing list or contact the x86 maintainers.
|
||||
@@ -1,3 +1,9 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
==============
|
||||
Kernel Entries
|
||||
==============
|
||||
|
||||
This file documents some of the kernel entries in
|
||||
arch/x86/entry/entry_64.S. A lot of this explanation is adapted from
|
||||
an email from Ingo Molnar:
|
||||
@@ -59,7 +65,7 @@ Now, there's a secondary complication: there's a cheap way to test
|
||||
which mode the CPU is in and an expensive way.
|
||||
|
||||
The cheap way is to pick this info off the entry frame on the kernel
|
||||
stack, from the CS of the ptregs area of the kernel stack:
|
||||
stack, from the CS of the ptregs area of the kernel stack::
|
||||
|
||||
xorl %ebx,%ebx
|
||||
testl $3,CS+8(%rsp)
|
||||
@@ -67,7 +73,7 @@ stack, from the CS of the ptregs area of the kernel stack:
|
||||
SWAPGS
|
||||
|
||||
The expensive (paranoid) way is to read back the MSR_GS_BASE value
|
||||
(which is what SWAPGS modifies):
|
||||
(which is what SWAPGS modifies)::
|
||||
|
||||
movl $1,%ebx
|
||||
movl $MSR_GS_BASE,%ecx
|
||||
@@ -76,7 +82,7 @@ The expensive (paranoid) way is to read back the MSR_GS_BASE value
|
||||
js 1f /* negative -> in kernel */
|
||||
SWAPGS
|
||||
xorl %ebx,%ebx
|
||||
1: ret
|
||||
1: ret
|
||||
|
||||
If we are at an interrupt or user-trap/gate-alike boundary then we can
|
||||
use the faster check: the stack will be a reliable indicator of
|
||||
@@ -1,5 +1,10 @@
|
||||
Kernel level exception handling in Linux
|
||||
Commentary by Joerg Pommnitz <joerg@raleigh.ibm.com>
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
===============================
|
||||
Kernel level exception handling
|
||||
===============================
|
||||
|
||||
Commentary by Joerg Pommnitz <joerg@raleigh.ibm.com>
|
||||
|
||||
When a process runs in kernel mode, it often has to access user
|
||||
mode memory whose address has been passed by an untrusted program.
|
||||
@@ -25,9 +30,9 @@ How does this work?
|
||||
|
||||
Whenever the kernel tries to access an address that is currently not
|
||||
accessible, the CPU generates a page fault exception and calls the
|
||||
page fault handler
|
||||
page fault handler::
|
||||
|
||||
void do_page_fault(struct pt_regs *regs, unsigned long error_code)
|
||||
void do_page_fault(struct pt_regs *regs, unsigned long error_code)
|
||||
|
||||
in arch/x86/mm/fault.c. The parameters on the stack are set up by
|
||||
the low level assembly glue in arch/x86/kernel/entry_32.S. The parameter
|
||||
@@ -57,73 +62,74 @@ as an example. The definition is somewhat hard to follow, so let's peek at
|
||||
the code generated by the preprocessor and the compiler. I selected
|
||||
the get_user call in drivers/char/sysrq.c for a detailed examination.
|
||||
|
||||
The original code in sysrq.c line 587:
|
||||
The original code in sysrq.c line 587::
|
||||
|
||||
get_user(c, buf);
|
||||
|
||||
The preprocessor output (edited to become somewhat readable):
|
||||
The preprocessor output (edited to become somewhat readable)::
|
||||
|
||||
(
|
||||
{
|
||||
long __gu_err = - 14 , __gu_val = 0;
|
||||
const __typeof__(*( ( buf ) )) *__gu_addr = ((buf));
|
||||
if (((((0 + current_set[0])->tss.segment) == 0x18 ) ||
|
||||
(((sizeof(*(buf))) <= 0xC0000000UL) &&
|
||||
((unsigned long)(__gu_addr ) <= 0xC0000000UL - (sizeof(*(buf)))))))
|
||||
do {
|
||||
__gu_err = 0;
|
||||
switch ((sizeof(*(buf)))) {
|
||||
case 1:
|
||||
__asm__ __volatile__(
|
||||
"1: mov" "b" " %2,%" "b" "1\n"
|
||||
"2:\n"
|
||||
".section .fixup,\"ax\"\n"
|
||||
"3: movl %3,%0\n"
|
||||
" xor" "b" " %" "b" "1,%" "b" "1\n"
|
||||
" jmp 2b\n"
|
||||
".section __ex_table,\"a\"\n"
|
||||
" .align 4\n"
|
||||
" .long 1b,3b\n"
|
||||
".text" : "=r"(__gu_err), "=q" (__gu_val): "m"((*(struct __large_struct *)
|
||||
( __gu_addr )) ), "i"(- 14 ), "0"( __gu_err )) ;
|
||||
break;
|
||||
case 2:
|
||||
__asm__ __volatile__(
|
||||
"1: mov" "w" " %2,%" "w" "1\n"
|
||||
"2:\n"
|
||||
".section .fixup,\"ax\"\n"
|
||||
"3: movl %3,%0\n"
|
||||
" xor" "w" " %" "w" "1,%" "w" "1\n"
|
||||
" jmp 2b\n"
|
||||
".section __ex_table,\"a\"\n"
|
||||
" .align 4\n"
|
||||
" .long 1b,3b\n"
|
||||
".text" : "=r"(__gu_err), "=r" (__gu_val) : "m"((*(struct __large_struct *)
|
||||
( __gu_addr )) ), "i"(- 14 ), "0"( __gu_err ));
|
||||
break;
|
||||
case 4:
|
||||
__asm__ __volatile__(
|
||||
"1: mov" "l" " %2,%" "" "1\n"
|
||||
"2:\n"
|
||||
".section .fixup,\"ax\"\n"
|
||||
"3: movl %3,%0\n"
|
||||
" xor" "l" " %" "" "1,%" "" "1\n"
|
||||
" jmp 2b\n"
|
||||
".section __ex_table,\"a\"\n"
|
||||
" .align 4\n" " .long 1b,3b\n"
|
||||
".text" : "=r"(__gu_err), "=r" (__gu_val) : "m"((*(struct __large_struct *)
|
||||
( __gu_addr )) ), "i"(- 14 ), "0"(__gu_err));
|
||||
break;
|
||||
default:
|
||||
(__gu_val) = __get_user_bad();
|
||||
}
|
||||
} while (0) ;
|
||||
((c)) = (__typeof__(*((buf))))__gu_val;
|
||||
__gu_err;
|
||||
}
|
||||
);
|
||||
(
|
||||
{
|
||||
long __gu_err = - 14 , __gu_val = 0;
|
||||
const __typeof__(*( ( buf ) )) *__gu_addr = ((buf));
|
||||
if (((((0 + current_set[0])->tss.segment) == 0x18 ) ||
|
||||
(((sizeof(*(buf))) <= 0xC0000000UL) &&
|
||||
((unsigned long)(__gu_addr ) <= 0xC0000000UL - (sizeof(*(buf)))))))
|
||||
do {
|
||||
__gu_err = 0;
|
||||
switch ((sizeof(*(buf)))) {
|
||||
case 1:
|
||||
__asm__ __volatile__(
|
||||
"1: mov" "b" " %2,%" "b" "1\n"
|
||||
"2:\n"
|
||||
".section .fixup,\"ax\"\n"
|
||||
"3: movl %3,%0\n"
|
||||
" xor" "b" " %" "b" "1,%" "b" "1\n"
|
||||
" jmp 2b\n"
|
||||
".section __ex_table,\"a\"\n"
|
||||
" .align 4\n"
|
||||
" .long 1b,3b\n"
|
||||
".text" : "=r"(__gu_err), "=q" (__gu_val): "m"((*(struct __large_struct *)
|
||||
( __gu_addr )) ), "i"(- 14 ), "0"( __gu_err )) ;
|
||||
break;
|
||||
case 2:
|
||||
__asm__ __volatile__(
|
||||
"1: mov" "w" " %2,%" "w" "1\n"
|
||||
"2:\n"
|
||||
".section .fixup,\"ax\"\n"
|
||||
"3: movl %3,%0\n"
|
||||
" xor" "w" " %" "w" "1,%" "w" "1\n"
|
||||
" jmp 2b\n"
|
||||
".section __ex_table,\"a\"\n"
|
||||
" .align 4\n"
|
||||
" .long 1b,3b\n"
|
||||
".text" : "=r"(__gu_err), "=r" (__gu_val) : "m"((*(struct __large_struct *)
|
||||
( __gu_addr )) ), "i"(- 14 ), "0"( __gu_err ));
|
||||
break;
|
||||
case 4:
|
||||
__asm__ __volatile__(
|
||||
"1: mov" "l" " %2,%" "" "1\n"
|
||||
"2:\n"
|
||||
".section .fixup,\"ax\"\n"
|
||||
"3: movl %3,%0\n"
|
||||
" xor" "l" " %" "" "1,%" "" "1\n"
|
||||
" jmp 2b\n"
|
||||
".section __ex_table,\"a\"\n"
|
||||
" .align 4\n" " .long 1b,3b\n"
|
||||
".text" : "=r"(__gu_err), "=r" (__gu_val) : "m"((*(struct __large_struct *)
|
||||
( __gu_addr )) ), "i"(- 14 ), "0"(__gu_err));
|
||||
break;
|
||||
default:
|
||||
(__gu_val) = __get_user_bad();
|
||||
}
|
||||
} while (0) ;
|
||||
((c)) = (__typeof__(*((buf))))__gu_val;
|
||||
__gu_err;
|
||||
}
|
||||
);
|
||||
|
||||
WOW! Black GCC/assembly magic. This is impossible to follow, so let's
|
||||
see what code gcc generates:
|
||||
see what code gcc generates::
|
||||
|
||||
> xorl %edx,%edx
|
||||
> movl current_set,%eax
|
||||
@@ -154,7 +160,7 @@ understand. Can we? The actual user access is quite obvious. Thanks
|
||||
to the unified address space we can just access the address in user
|
||||
memory. But what does the .section stuff do?????
|
||||
|
||||
To understand this we have to look at the final kernel:
|
||||
To understand this we have to look at the final kernel::
|
||||
|
||||
> objdump --section-headers vmlinux
|
||||
>
|
||||
@@ -181,7 +187,7 @@ To understand this we have to look at the final kernel:
|
||||
|
||||
There are obviously 2 non standard ELF sections in the generated object
|
||||
file. But first we want to find out what happened to our code in the
|
||||
final kernel executable:
|
||||
final kernel executable::
|
||||
|
||||
> objdump --disassemble --section=.text vmlinux
|
||||
>
|
||||
@@ -199,7 +205,7 @@ final kernel executable:
|
||||
The whole user memory access is reduced to 10 x86 machine instructions.
|
||||
The instructions bracketed in the .section directives are no longer
|
||||
in the normal execution path. They are located in a different section
|
||||
of the executable file:
|
||||
of the executable file::
|
||||
|
||||
> objdump --disassemble --section=.fixup vmlinux
|
||||
>
|
||||
@@ -207,14 +213,15 @@ of the executable file:
|
||||
> c0199ffa <.fixup+10ba> xorb %dl,%dl
|
||||
> c0199ffc <.fixup+10bc> jmp c017e7a7 <do_con_write+e3>
|
||||
|
||||
And finally:
|
||||
And finally::
|
||||
|
||||
> objdump --full-contents --section=__ex_table vmlinux
|
||||
>
|
||||
> c01aa7c4 93c017c0 e09f19c0 97c017c0 99c017c0 ................
|
||||
> c01aa7d4 f6c217c0 e99f19c0 a5e717c0 f59f19c0 ................
|
||||
> c01aa7e4 080a18c0 01a019c0 0a0a18c0 04a019c0 ................
|
||||
|
||||
or in human readable byte order:
|
||||
or in human readable byte order::
|
||||
|
||||
> c01aa7c4 c017c093 c0199fe0 c017c097 c017c099 ................
|
||||
> c01aa7d4 c017c2f6 c0199fe9 c017e7a5 c0199ff5 ................
|
||||
@@ -222,18 +229,22 @@ or in human readable byte order:
|
||||
this is the interesting part!
|
||||
> c01aa7e4 c0180a08 c019a001 c0180a0a c019a004 ................
|
||||
|
||||
What happened? The assembly directives
|
||||
What happened? The assembly directives::
|
||||
|
||||
.section .fixup,"ax"
|
||||
.section __ex_table,"a"
|
||||
.section .fixup,"ax"
|
||||
.section __ex_table,"a"
|
||||
|
||||
told the assembler to move the following code to the specified
|
||||
sections in the ELF object file. So the instructions
|
||||
3: movl $-14,%eax
|
||||
xorb %dl,%dl
|
||||
jmp 2b
|
||||
ended up in the .fixup section of the object file and the addresses
|
||||
sections in the ELF object file. So the instructions::
|
||||
|
||||
3: movl $-14,%eax
|
||||
xorb %dl,%dl
|
||||
jmp 2b
|
||||
|
||||
ended up in the .fixup section of the object file and the addresses::
|
||||
|
||||
.long 1b,3b
|
||||
|
||||
ended up in the __ex_table section of the object file. 1b and 3b
|
||||
are local labels. The local label 1b (1b stands for next label 1
|
||||
backward) is the address of the instruction that might fault, i.e.
|
||||
@@ -246,35 +257,39 @@ the fault, in our case the actual value is c0199ff5:
|
||||
the original assembly code: > 3: movl $-14,%eax
|
||||
and linked in vmlinux : > c0199ff5 <.fixup+10b5> movl $0xfffffff2,%eax
|
||||
|
||||
The assembly code
|
||||
The assembly code::
|
||||
|
||||
> .section __ex_table,"a"
|
||||
> .align 4
|
||||
> .long 1b,3b
|
||||
|
||||
becomes the value pair
|
||||
becomes the value pair::
|
||||
|
||||
> c01aa7d4 c017c2f6 c0199fe9 c017e7a5 c0199ff5 ................
|
||||
^this is ^this is
|
||||
1b 3b
|
||||
|
||||
c017e7a5,c0199ff5 in the exception table of the kernel.
|
||||
|
||||
So, what actually happens if a fault from kernel mode with no suitable
|
||||
vma occurs?
|
||||
|
||||
1.) access to invalid address:
|
||||
> c017e7a5 <do_con_write+e1> movb (%ebx),%dl
|
||||
2.) MMU generates exception
|
||||
3.) CPU calls do_page_fault
|
||||
4.) do page fault calls search_exception_table (regs->eip == c017e7a5);
|
||||
5.) search_exception_table looks up the address c017e7a5 in the
|
||||
exception table (i.e. the contents of the ELF section __ex_table)
|
||||
and returns the address of the associated fault handle code c0199ff5.
|
||||
6.) do_page_fault modifies its own return address to point to the fault
|
||||
handle code and returns.
|
||||
7.) execution continues in the fault handling code.
|
||||
8.) 8a) EAX becomes -EFAULT (== -14)
|
||||
8b) DL becomes zero (the value we "read" from user space)
|
||||
8c) execution continues at local label 2 (address of the
|
||||
instruction immediately after the faulting user access).
|
||||
#. access to invalid address::
|
||||
|
||||
> c017e7a5 <do_con_write+e1> movb (%ebx),%dl
|
||||
#. MMU generates exception
|
||||
#. CPU calls do_page_fault
|
||||
#. do page fault calls search_exception_table (regs->eip == c017e7a5);
|
||||
#. search_exception_table looks up the address c017e7a5 in the
|
||||
exception table (i.e. the contents of the ELF section __ex_table)
|
||||
and returns the address of the associated fault handle code c0199ff5.
|
||||
#. do_page_fault modifies its own return address to point to the fault
|
||||
handle code and returns.
|
||||
#. execution continues in the fault handling code.
|
||||
#. a) EAX becomes -EFAULT (== -14)
|
||||
b) DL becomes zero (the value we "read" from user space)
|
||||
c) execution continues at local label 2 (address of the
|
||||
instruction immediately after the faulting user access).
|
||||
|
||||
The steps 8a to 8c in a certain way emulate the faulting instruction.
|
||||
|
||||
@@ -295,14 +310,15 @@ Things changed when 64-bit support was added to x86 Linux. Rather than
|
||||
double the size of the exception table by expanding the two entries
|
||||
from 32-bits to 64 bits, a clever trick was used to store addresses
|
||||
as relative offsets from the table itself. The assembly code changed
|
||||
from:
|
||||
.long 1b,3b
|
||||
to:
|
||||
.long (from) - .
|
||||
.long (to) - .
|
||||
from::
|
||||
|
||||
.long 1b,3b
|
||||
to:
|
||||
.long (from) - .
|
||||
.long (to) - .
|
||||
|
||||
and the C-code that uses these values converts back to absolute addresses
|
||||
like this:
|
||||
like this::
|
||||
|
||||
ex_insn_addr(const struct exception_table_entry *x)
|
||||
{
|
||||
@@ -313,15 +329,18 @@ In v4.6 the exception table entry was expanded with a new field "handler".
|
||||
This is also 32-bits wide and contains a third relative function
|
||||
pointer which points to one of:
|
||||
|
||||
1) int ex_handler_default(const struct exception_table_entry *fixup)
|
||||
This is legacy case that just jumps to the fixup code
|
||||
2) int ex_handler_fault(const struct exception_table_entry *fixup)
|
||||
This case provides the fault number of the trap that occurred at
|
||||
entry->insn. It is used to distinguish page faults from machine
|
||||
check.
|
||||
3) int ex_handler_ext(const struct exception_table_entry *fixup)
|
||||
This case is used for uaccess_err ... we need to set a flag
|
||||
in the task structure. Before the handler functions existed this
|
||||
case was handled by adding a large offset to the fixup to tag
|
||||
it as special.
|
||||
1) ``int ex_handler_default(const struct exception_table_entry *fixup)``
|
||||
This is legacy case that just jumps to the fixup code
|
||||
|
||||
2) ``int ex_handler_fault(const struct exception_table_entry *fixup)``
|
||||
This case provides the fault number of the trap that occurred at
|
||||
entry->insn. It is used to distinguish page faults from machine
|
||||
check.
|
||||
|
||||
3) ``int ex_handler_ext(const struct exception_table_entry *fixup)``
|
||||
This case is used for uaccess_err ... we need to set a flag
|
||||
in the task structure. Before the handler functions existed this
|
||||
case was handled by adding a large offset to the fixup to tag
|
||||
it as special.
|
||||
|
||||
More functions can easily be added.
|
||||
@@ -1,3 +1,11 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
=======
|
||||
IO-APIC
|
||||
=======
|
||||
|
||||
:Author: Ingo Molnar <mingo@kernel.org>
|
||||
|
||||
Most (all) Intel-MP compliant SMP boards have the so-called 'IO-APIC',
|
||||
which is an enhanced interrupt controller. It enables us to route
|
||||
hardware interrupts to multiple CPUs, or to CPU groups. Without an
|
||||
@@ -13,9 +21,8 @@ usually worked around by the kernel. If your MP-compliant SMP board does
|
||||
not boot Linux, then consult the linux-smp mailing list archives first.
|
||||
|
||||
If your box boots fine with enabled IO-APIC IRQs, then your
|
||||
/proc/interrupts will look like this one:
|
||||
/proc/interrupts will look like this one::
|
||||
|
||||
---------------------------->
|
||||
hell:~> cat /proc/interrupts
|
||||
CPU0
|
||||
0: 1360293 IO-APIC-edge timer
|
||||
@@ -28,7 +35,6 @@ If your box boots fine with enabled IO-APIC IRQs, then your
|
||||
NMI: 0
|
||||
ERR: 0
|
||||
hell:~>
|
||||
<----------------------------
|
||||
|
||||
Some interrupts are still listed as 'XT PIC', but this is not a problem;
|
||||
none of those IRQ sources is performance-critical.
|
||||
@@ -37,14 +43,14 @@ none of those IRQ sources is performance-critical.
|
||||
In the unlikely case that your board does not create a working mp-table,
|
||||
you can use the pirq= boot parameter to 'hand-construct' IRQ entries. This
|
||||
is non-trivial though and cannot be automated. One sample /etc/lilo.conf
|
||||
entry:
|
||||
entry::
|
||||
|
||||
append="pirq=15,11,10"
|
||||
|
||||
The actual numbers depend on your system, on your PCI cards and on their
|
||||
PCI slot position. Usually PCI slots are 'daisy chained' before they are
|
||||
connected to the PCI chipset IRQ routing facility (the incoming PIRQ1-4
|
||||
lines):
|
||||
lines)::
|
||||
|
||||
,-. ,-. ,-. ,-. ,-.
|
||||
PIRQ4 ----| |-. ,-| |-. ,-| |-. ,-| |--------| |
|
||||
@@ -56,7 +62,7 @@ lines):
|
||||
PIRQ1 ----| |- `----| |- `----| |- `----| |--------| |
|
||||
`-' `-' `-' `-' `-'
|
||||
|
||||
Every PCI card emits a PCI IRQ, which can be INTA, INTB, INTC or INTD:
|
||||
Every PCI card emits a PCI IRQ, which can be INTA, INTB, INTC or INTD::
|
||||
|
||||
,-.
|
||||
INTD--| |
|
||||
@@ -78,19 +84,19 @@ to have non shared interrupts). Slot5 should be used for videocards, they
|
||||
do not use interrupts normally, thus they are not daisy chained either.
|
||||
|
||||
so if you have your SCSI card (IRQ11) in Slot1, Tulip card (IRQ9) in
|
||||
Slot2, then you'll have to specify this pirq= line:
|
||||
Slot2, then you'll have to specify this pirq= line::
|
||||
|
||||
append="pirq=11,9"
|
||||
|
||||
the following script tries to figure out such a default pirq= line from
|
||||
your PCI configuration:
|
||||
your PCI configuration::
|
||||
|
||||
echo -n pirq=; echo `scanpci | grep T_L | cut -c56-` | sed 's/ /,/g'
|
||||
|
||||
note that this script won't work if you have skipped a few slots or if your
|
||||
board does not do default daisy-chaining. (or the IO-APIC has the PIRQ pins
|
||||
connected in some strange way). E.g. if in the above case you have your SCSI
|
||||
card (IRQ11) in Slot3, and have Slot1 empty:
|
||||
card (IRQ11) in Slot3, and have Slot1 empty::
|
||||
|
||||
append="pirq=0,9,11"
|
||||
|
||||
@@ -105,7 +111,7 @@ won't function properly (e.g. if it's inserted as a module).
|
||||
If you have 2 PCI buses, then you can use up to 8 pirq values, although such
|
||||
boards tend to have a good configuration.
|
||||
|
||||
Be prepared that it might happen that you need some strange pirq line:
|
||||
Be prepared that it might happen that you need some strange pirq line::
|
||||
|
||||
append="pirq=0,0,0,0,0,0,9,11"
|
||||
|
||||
@@ -115,5 +121,3 @@ Good luck and mail to linux-smp@vger.kernel.org or
|
||||
linux-kernel@vger.kernel.org if you have any problems that are not covered
|
||||
by this document.
|
||||
|
||||
-- mingo
|
||||
|
||||
10
Documentation/x86/i386/index.rst
Normal file
10
Documentation/x86/i386/index.rst
Normal file
@@ -0,0 +1,10 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
============
|
||||
i386 Support
|
||||
============
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
IO-APIC
|
||||
30
Documentation/x86/index.rst
Normal file
30
Documentation/x86/index.rst
Normal file
@@ -0,0 +1,30 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
==========================
|
||||
x86-specific Documentation
|
||||
==========================
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
:numbered:
|
||||
|
||||
boot
|
||||
topology
|
||||
exception-tables
|
||||
kernel-stacks
|
||||
entry_64
|
||||
earlyprintk
|
||||
orc-unwinder
|
||||
zero-page
|
||||
tlb
|
||||
mtrr
|
||||
pat
|
||||
protection-keys
|
||||
intel_mpx
|
||||
amd-memory-encryption
|
||||
pti
|
||||
microcode
|
||||
resctrl_ui
|
||||
usb-legacy-support
|
||||
i386/index
|
||||
x86_64/index
|
||||
@@ -1,5 +1,11 @@
|
||||
1. Intel(R) MPX Overview
|
||||
========================
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
===========================================
|
||||
Intel(R) Memory Protection Extensions (MPX)
|
||||
===========================================
|
||||
|
||||
Intel(R) MPX Overview
|
||||
=====================
|
||||
|
||||
Intel(R) Memory Protection Extensions (Intel(R) MPX) is a new capability
|
||||
introduced into Intel Architecture. Intel MPX provides hardware features
|
||||
@@ -7,7 +13,7 @@ that can be used in conjunction with compiler changes to check memory
|
||||
references, for those references whose compile-time normal intentions are
|
||||
usurped at runtime due to buffer overflow or underflow.
|
||||
|
||||
You can tell if your CPU supports MPX by looking in /proc/cpuinfo:
|
||||
You can tell if your CPU supports MPX by looking in /proc/cpuinfo::
|
||||
|
||||
cat /proc/cpuinfo | grep ' mpx '
|
||||
|
||||
@@ -21,8 +27,8 @@ can be downloaded from
|
||||
http://software.intel.com/en-us/articles/intel-software-development-emulator
|
||||
|
||||
|
||||
2. How to get the advantage of MPX
|
||||
==================================
|
||||
How to get the advantage of MPX
|
||||
===============================
|
||||
|
||||
For MPX to work, changes are required in the kernel, binutils and compiler.
|
||||
No source changes are required for applications, just a recompile.
|
||||
@@ -84,14 +90,15 @@ Kernel MPX Code:
|
||||
is unmapped.
|
||||
|
||||
|
||||
3. How does MPX kernel code work
|
||||
================================
|
||||
How does MPX kernel code work
|
||||
=============================
|
||||
|
||||
Handling #BR faults caused by MPX
|
||||
---------------------------------
|
||||
|
||||
When MPX is enabled, there are 2 new situations that can generate
|
||||
#BR faults.
|
||||
|
||||
* new bounds tables (BT) need to be allocated to save bounds.
|
||||
* bounds violation caused by MPX instructions.
|
||||
|
||||
@@ -124,37 +131,37 @@ the kernel. It can theoretically be done completely from userspace. Here
|
||||
are a few ways this could be done. We don't think any of them are practical
|
||||
in the real-world, but here they are.
|
||||
|
||||
Q: Can virtual space simply be reserved for the bounds tables so that we
|
||||
never have to allocate them?
|
||||
A: MPX-enabled application will possibly create a lot of bounds tables in
|
||||
process address space to save bounds information. These tables can take
|
||||
up huge swaths of memory (as much as 80% of the memory on the system)
|
||||
even if we clean them up aggressively. In the worst-case scenario, the
|
||||
tables can be 4x the size of the data structure being tracked. IOW, a
|
||||
1-page structure can require 4 bounds-table pages. An X-GB virtual
|
||||
area needs 4*X GB of virtual space, plus 2GB for the bounds directory.
|
||||
If we were to preallocate them for the 128TB of user virtual address
|
||||
space, we would need to reserve 512TB+2GB, which is larger than the
|
||||
entire virtual address space today. This means they can not be reserved
|
||||
ahead of time. Also, a single process's pre-populated bounds directory
|
||||
consumes 2GB of virtual *AND* physical memory. IOW, it's completely
|
||||
infeasible to prepopulate bounds directories.
|
||||
:Q: Can virtual space simply be reserved for the bounds tables so that we
|
||||
never have to allocate them?
|
||||
:A: MPX-enabled application will possibly create a lot of bounds tables in
|
||||
process address space to save bounds information. These tables can take
|
||||
up huge swaths of memory (as much as 80% of the memory on the system)
|
||||
even if we clean them up aggressively. In the worst-case scenario, the
|
||||
tables can be 4x the size of the data structure being tracked. IOW, a
|
||||
1-page structure can require 4 bounds-table pages. An X-GB virtual
|
||||
area needs 4*X GB of virtual space, plus 2GB for the bounds directory.
|
||||
If we were to preallocate them for the 128TB of user virtual address
|
||||
space, we would need to reserve 512TB+2GB, which is larger than the
|
||||
entire virtual address space today. This means they can not be reserved
|
||||
ahead of time. Also, a single process's pre-populated bounds directory
|
||||
consumes 2GB of virtual *AND* physical memory. IOW, it's completely
|
||||
infeasible to prepopulate bounds directories.
|
||||
|
||||
Q: Can we preallocate bounds table space at the same time memory is
|
||||
allocated which might contain pointers that might eventually need
|
||||
bounds tables?
|
||||
A: This would work if we could hook the site of each and every memory
|
||||
allocation syscall. This can be done for small, constrained applications.
|
||||
But, it isn't practical at a larger scale since a given app has no
|
||||
way of controlling how all the parts of the app might allocate memory
|
||||
(think libraries). The kernel is really the only place to intercept
|
||||
these calls.
|
||||
:Q: Can we preallocate bounds table space at the same time memory is
|
||||
allocated which might contain pointers that might eventually need
|
||||
bounds tables?
|
||||
:A: This would work if we could hook the site of each and every memory
|
||||
allocation syscall. This can be done for small, constrained applications.
|
||||
But, it isn't practical at a larger scale since a given app has no
|
||||
way of controlling how all the parts of the app might allocate memory
|
||||
(think libraries). The kernel is really the only place to intercept
|
||||
these calls.
|
||||
|
||||
Q: Could a bounds fault be handed to userspace and the tables allocated
|
||||
there in a signal handler instead of in the kernel?
|
||||
A: mmap() is not on the list of safe async handler functions and even
|
||||
if mmap() would work it still requires locking or nasty tricks to
|
||||
keep track of the allocation state there.
|
||||
:Q: Could a bounds fault be handed to userspace and the tables allocated
|
||||
there in a signal handler instead of in the kernel?
|
||||
:A: mmap() is not on the list of safe async handler functions and even
|
||||
if mmap() would work it still requires locking or nasty tricks to
|
||||
keep track of the allocation state there.
|
||||
|
||||
Having ruled out all of the userspace-only approaches for managing
|
||||
bounds tables that we could think of, we create them on demand in
|
||||
@@ -167,20 +174,20 @@ If a #BR is generated due to a bounds violation caused by MPX.
|
||||
We need to decode MPX instructions to get violation address and
|
||||
set this address into extended struct siginfo.
|
||||
|
||||
The _sigfault field of struct siginfo is extended as follow:
|
||||
The _sigfault field of struct siginfo is extended as follow::
|
||||
|
||||
87 /* SIGILL, SIGFPE, SIGSEGV, SIGBUS */
|
||||
88 struct {
|
||||
89 void __user *_addr; /* faulting insn/memory ref. */
|
||||
90 #ifdef __ARCH_SI_TRAPNO
|
||||
91 int _trapno; /* TRAP # which caused the signal */
|
||||
92 #endif
|
||||
93 short _addr_lsb; /* LSB of the reported address */
|
||||
94 struct {
|
||||
95 void __user *_lower;
|
||||
96 void __user *_upper;
|
||||
97 } _addr_bnd;
|
||||
98 } _sigfault;
|
||||
87 /* SIGILL, SIGFPE, SIGSEGV, SIGBUS */
|
||||
88 struct {
|
||||
89 void __user *_addr; /* faulting insn/memory ref. */
|
||||
90 #ifdef __ARCH_SI_TRAPNO
|
||||
91 int _trapno; /* TRAP # which caused the signal */
|
||||
92 #endif
|
||||
93 short _addr_lsb; /* LSB of the reported address */
|
||||
94 struct {
|
||||
95 void __user *_lower;
|
||||
96 void __user *_upper;
|
||||
97 } _addr_bnd;
|
||||
98 } _sigfault;
|
||||
|
||||
The '_addr' field refers to violation address, and new '_addr_and'
|
||||
field refers to the upper/lower bounds when a #BR is caused.
|
||||
@@ -209,9 +216,10 @@ Adding new prctl commands
|
||||
|
||||
Two new prctl commands are added to enable and disable MPX bounds tables
|
||||
management in kernel.
|
||||
::
|
||||
|
||||
155 #define PR_MPX_ENABLE_MANAGEMENT 43
|
||||
156 #define PR_MPX_DISABLE_MANAGEMENT 44
|
||||
155 #define PR_MPX_ENABLE_MANAGEMENT 43
|
||||
156 #define PR_MPX_DISABLE_MANAGEMENT 44
|
||||
|
||||
Runtime library in userspace is responsible for allocation of bounds
|
||||
directory. So kernel have to use XSAVE instruction to get the base
|
||||
@@ -223,8 +231,8 @@ into struct mm_struct to be used in future during PR_MPX_ENABLE_MANAGEMENT
|
||||
command execution.
|
||||
|
||||
|
||||
4. Special rules
|
||||
================
|
||||
Special rules
|
||||
=============
|
||||
|
||||
1) If userspace is requesting help from the kernel to do the management
|
||||
of bounds tables, it may not create or modify entries in the bounds directory.
|
||||
@@ -1,5 +1,11 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
=============
|
||||
Kernel Stacks
|
||||
=============
|
||||
|
||||
Kernel stacks on x86-64 bit
|
||||
---------------------------
|
||||
===========================
|
||||
|
||||
Most of the text from Keith Owens, hacked by AK
|
||||
|
||||
@@ -57,7 +63,7 @@ IST events with the same code to be nested. However in most cases, the
|
||||
stack size allocated to an IST assumes no nesting for the same code.
|
||||
If that assumption is ever broken then the stacks will become corrupt.
|
||||
|
||||
The currently assigned IST stacks are :-
|
||||
The currently assigned IST stacks are:
|
||||
|
||||
* ESTACK_DF. EXCEPTION_STKSZ (PAGE_SIZE).
|
||||
|
||||
@@ -103,7 +109,7 @@ For more details see the Intel IA32 or AMD AMD64 architecture manuals.
|
||||
|
||||
|
||||
Printing backtraces on x86
|
||||
--------------------------
|
||||
==========================
|
||||
|
||||
The question about the '?' preceding function names in an x86 stacktrace
|
||||
keeps popping up, here's an indepth explanation. It helps if the reader
|
||||
@@ -113,7 +119,7 @@ arch/x86/kernel/dumpstack.c.
|
||||
Adapted from Ingo's mail, Message-ID: <20150521101614.GA10889@gmail.com>:
|
||||
|
||||
We always scan the full kernel stack for return addresses stored on
|
||||
the kernel stack(s) [*], from stack top to stack bottom, and print out
|
||||
the kernel stack(s) [1]_, from stack top to stack bottom, and print out
|
||||
anything that 'looks like' a kernel text address.
|
||||
|
||||
If it fits into the frame pointer chain, we print it without a question
|
||||
@@ -141,6 +147,6 @@ that look like kernel text addresses, so if debug information is wrong,
|
||||
we still print out the real call chain as well - just with more question
|
||||
marks than ideal.
|
||||
|
||||
[*] For things like IRQ and IST stacks, we also scan those stacks, in
|
||||
the right order, and try to cross from one stack into another
|
||||
reconstructing the call chain. This works most of the time.
|
||||
.. [1] For things like IRQ and IST stacks, we also scan those stacks, in
|
||||
the right order, and try to cross from one stack into another
|
||||
reconstructing the call chain. This works most of the time.
|
||||
@@ -1,7 +1,11 @@
|
||||
The Linux Microcode Loader
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
Authors: Fenghua Yu <fenghua.yu@intel.com>
|
||||
Borislav Petkov <bp@suse.de>
|
||||
==========================
|
||||
The Linux Microcode Loader
|
||||
==========================
|
||||
|
||||
:Authors: - Fenghua Yu <fenghua.yu@intel.com>
|
||||
- Borislav Petkov <bp@suse.de>
|
||||
|
||||
The kernel has a x86 microcode loading facility which is supposed to
|
||||
provide microcode loading methods in the OS. Potential use cases are
|
||||
@@ -10,8 +14,8 @@ and updating the microcode on long-running systems without rebooting.
|
||||
|
||||
The loader supports three loading methods:
|
||||
|
||||
1. Early load microcode
|
||||
=======================
|
||||
Early load microcode
|
||||
====================
|
||||
|
||||
The kernel can update microcode very early during boot. Loading
|
||||
microcode early can fix CPU issues before they are observed during
|
||||
@@ -26,8 +30,10 @@ loader parses the combined initrd image during boot.
|
||||
|
||||
The microcode files in cpio name space are:
|
||||
|
||||
on Intel: kernel/x86/microcode/GenuineIntel.bin
|
||||
on AMD : kernel/x86/microcode/AuthenticAMD.bin
|
||||
on Intel:
|
||||
kernel/x86/microcode/GenuineIntel.bin
|
||||
on AMD :
|
||||
kernel/x86/microcode/AuthenticAMD.bin
|
||||
|
||||
During BSP (BootStrapping Processor) boot (pre-SMP), the kernel
|
||||
scans the microcode file in the initrd. If microcode matching the
|
||||
@@ -42,8 +48,8 @@ Here's a crude example how to prepare an initrd with microcode (this is
|
||||
normally done automatically by the distribution, when recreating the
|
||||
initrd, so you don't really have to do it yourself. It is documented
|
||||
here for future reference only).
|
||||
::
|
||||
|
||||
---
|
||||
#!/bin/bash
|
||||
|
||||
if [ -z "$1" ]; then
|
||||
@@ -76,15 +82,15 @@ here for future reference only).
|
||||
cat ucode.cpio $INITRD.orig > $INITRD
|
||||
|
||||
rm -rf $TMPDIR
|
||||
---
|
||||
|
||||
|
||||
The system needs to have the microcode packages installed into
|
||||
/lib/firmware or you need to fixup the paths above if yours are
|
||||
somewhere else and/or you've downloaded them directly from the processor
|
||||
vendor's site.
|
||||
|
||||
2. Late loading
|
||||
===============
|
||||
Late loading
|
||||
============
|
||||
|
||||
There are two legacy user space interfaces to load microcode, either through
|
||||
/dev/cpu/microcode or through /sys/devices/system/cpu/microcode/reload file
|
||||
@@ -94,9 +100,9 @@ The /dev/cpu/microcode method is deprecated because it needs a special
|
||||
userspace tool for that.
|
||||
|
||||
The easier method is simply installing the microcode packages your distro
|
||||
supplies and running:
|
||||
supplies and running::
|
||||
|
||||
# echo 1 > /sys/devices/system/cpu/microcode/reload
|
||||
# echo 1 > /sys/devices/system/cpu/microcode/reload
|
||||
|
||||
as root.
|
||||
|
||||
@@ -104,29 +110,29 @@ The loading mechanism looks for microcode blobs in
|
||||
/lib/firmware/{intel-ucode,amd-ucode}. The default distro installation
|
||||
packages already put them there.
|
||||
|
||||
3. Builtin microcode
|
||||
====================
|
||||
Builtin microcode
|
||||
=================
|
||||
|
||||
The loader supports also loading of a builtin microcode supplied through
|
||||
the regular builtin firmware method CONFIG_EXTRA_FIRMWARE. Only 64-bit is
|
||||
currently supported.
|
||||
|
||||
Here's an example:
|
||||
Here's an example::
|
||||
|
||||
CONFIG_EXTRA_FIRMWARE="intel-ucode/06-3a-09 amd-ucode/microcode_amd_fam15h.bin"
|
||||
CONFIG_EXTRA_FIRMWARE_DIR="/lib/firmware"
|
||||
CONFIG_EXTRA_FIRMWARE="intel-ucode/06-3a-09 amd-ucode/microcode_amd_fam15h.bin"
|
||||
CONFIG_EXTRA_FIRMWARE_DIR="/lib/firmware"
|
||||
|
||||
This basically means, you have the following tree structure locally:
|
||||
This basically means, you have the following tree structure locally::
|
||||
|
||||
/lib/firmware/
|
||||
|-- amd-ucode
|
||||
...
|
||||
| |-- microcode_amd_fam15h.bin
|
||||
...
|
||||
|-- intel-ucode
|
||||
...
|
||||
| |-- 06-3a-09
|
||||
...
|
||||
/lib/firmware/
|
||||
|-- amd-ucode
|
||||
...
|
||||
| |-- microcode_amd_fam15h.bin
|
||||
...
|
||||
|-- intel-ucode
|
||||
...
|
||||
| |-- 06-3a-09
|
||||
...
|
||||
|
||||
so that the build system can find those files and integrate them into
|
||||
the final kernel image. The early loader finds them and applies them.
|
||||
354
Documentation/x86/mtrr.rst
Normal file
354
Documentation/x86/mtrr.rst
Normal file
@@ -0,0 +1,354 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
=========================================
|
||||
MTRR (Memory Type Range Register) control
|
||||
=========================================
|
||||
|
||||
:Authors: - Richard Gooch <rgooch@atnf.csiro.au> - 3 Jun 1999
|
||||
- Luis R. Rodriguez <mcgrof@do-not-panic.com> - April 9, 2015
|
||||
|
||||
|
||||
Phasing out MTRR use
|
||||
====================
|
||||
|
||||
MTRR use is replaced on modern x86 hardware with PAT. Direct MTRR use by
|
||||
drivers on Linux is now completely phased out, device drivers should use
|
||||
arch_phys_wc_add() in combination with ioremap_wc() to make MTRR effective on
|
||||
non-PAT systems while a no-op but equally effective on PAT enabled systems.
|
||||
|
||||
Even if Linux does not use MTRRs directly, some x86 platform firmware may still
|
||||
set up MTRRs early before booting the OS. They do this as some platform
|
||||
firmware may still have implemented access to MTRRs which would be controlled
|
||||
and handled by the platform firmware directly. An example of platform use of
|
||||
MTRRs is through the use of SMI handlers, one case could be for fan control,
|
||||
the platform code would need uncachable access to some of its fan control
|
||||
registers. Such platform access does not need any Operating System MTRR code in
|
||||
place other than mtrr_type_lookup() to ensure any OS specific mapping requests
|
||||
are aligned with platform MTRR setup. If MTRRs are only set up by the platform
|
||||
firmware code though and the OS does not make any specific MTRR mapping
|
||||
requests mtrr_type_lookup() should always return MTRR_TYPE_INVALID.
|
||||
|
||||
For details refer to :doc:`pat`.
|
||||
|
||||
.. tip::
|
||||
On Intel P6 family processors (Pentium Pro, Pentium II and later)
|
||||
the Memory Type Range Registers (MTRRs) may be used to control
|
||||
processor access to memory ranges. This is most useful when you have
|
||||
a video (VGA) card on a PCI or AGP bus. Enabling write-combining
|
||||
allows bus write transfers to be combined into a larger transfer
|
||||
before bursting over the PCI/AGP bus. This can increase performance
|
||||
of image write operations 2.5 times or more.
|
||||
|
||||
The Cyrix 6x86, 6x86MX and M II processors have Address Range
|
||||
Registers (ARRs) which provide a similar functionality to MTRRs. For
|
||||
these, the ARRs are used to emulate the MTRRs.
|
||||
|
||||
The AMD K6-2 (stepping 8 and above) and K6-3 processors have two
|
||||
MTRRs. These are supported. The AMD Athlon family provide 8 Intel
|
||||
style MTRRs.
|
||||
|
||||
The Centaur C6 (WinChip) has 8 MCRs, allowing write-combining. These
|
||||
are supported.
|
||||
|
||||
The VIA Cyrix III and VIA C3 CPUs offer 8 Intel style MTRRs.
|
||||
|
||||
The CONFIG_MTRR option creates a /proc/mtrr file which may be used
|
||||
to manipulate your MTRRs. Typically the X server should use
|
||||
this. This should have a reasonably generic interface so that
|
||||
similar control registers on other processors can be easily
|
||||
supported.
|
||||
|
||||
There are two interfaces to /proc/mtrr: one is an ASCII interface
|
||||
which allows you to read and write. The other is an ioctl()
|
||||
interface. The ASCII interface is meant for administration. The
|
||||
ioctl() interface is meant for C programs (i.e. the X server). The
|
||||
interfaces are described below, with sample commands and C code.
|
||||
|
||||
|
||||
Reading MTRRs from the shell
|
||||
============================
|
||||
::
|
||||
|
||||
% cat /proc/mtrr
|
||||
reg00: base=0x00000000 ( 0MB), size= 128MB: write-back, count=1
|
||||
reg01: base=0x08000000 ( 128MB), size= 64MB: write-back, count=1
|
||||
|
||||
Creating MTRRs from the C-shell::
|
||||
|
||||
# echo "base=0xf8000000 size=0x400000 type=write-combining" >! /proc/mtrr
|
||||
|
||||
or if you use bash::
|
||||
|
||||
# echo "base=0xf8000000 size=0x400000 type=write-combining" >| /proc/mtrr
|
||||
|
||||
And the result thereof::
|
||||
|
||||
% cat /proc/mtrr
|
||||
reg00: base=0x00000000 ( 0MB), size= 128MB: write-back, count=1
|
||||
reg01: base=0x08000000 ( 128MB), size= 64MB: write-back, count=1
|
||||
reg02: base=0xf8000000 (3968MB), size= 4MB: write-combining, count=1
|
||||
|
||||
This is for video RAM at base address 0xf8000000 and size 4 megabytes. To
|
||||
find out your base address, you need to look at the output of your X
|
||||
server, which tells you where the linear framebuffer address is. A
|
||||
typical line that you may get is::
|
||||
|
||||
(--) S3: PCI: 968 rev 0, Linear FB @ 0xf8000000
|
||||
|
||||
Note that you should only use the value from the X server, as it may
|
||||
move the framebuffer base address, so the only value you can trust is
|
||||
that reported by the X server.
|
||||
|
||||
To find out the size of your framebuffer (what, you don't actually
|
||||
know?), the following line will tell you::
|
||||
|
||||
(--) S3: videoram: 4096k
|
||||
|
||||
That's 4 megabytes, which is 0x400000 bytes (in hexadecimal).
|
||||
A patch is being written for XFree86 which will make this automatic:
|
||||
in other words the X server will manipulate /proc/mtrr using the
|
||||
ioctl() interface, so users won't have to do anything. If you use a
|
||||
commercial X server, lobby your vendor to add support for MTRRs.
|
||||
|
||||
|
||||
Creating overlapping MTRRs
|
||||
==========================
|
||||
::
|
||||
|
||||
%echo "base=0xfb000000 size=0x1000000 type=write-combining" >/proc/mtrr
|
||||
%echo "base=0xfb000000 size=0x1000 type=uncachable" >/proc/mtrr
|
||||
|
||||
And the results::
|
||||
|
||||
% cat /proc/mtrr
|
||||
reg00: base=0x00000000 ( 0MB), size= 64MB: write-back, count=1
|
||||
reg01: base=0xfb000000 (4016MB), size= 16MB: write-combining, count=1
|
||||
reg02: base=0xfb000000 (4016MB), size= 4kB: uncachable, count=1
|
||||
|
||||
Some cards (especially Voodoo Graphics boards) need this 4 kB area
|
||||
excluded from the beginning of the region because it is used for
|
||||
registers.
|
||||
|
||||
NOTE: You can only create type=uncachable region, if the first
|
||||
region that you created is type=write-combining.
|
||||
|
||||
|
||||
Removing MTRRs from the C-shel
|
||||
==============================
|
||||
::
|
||||
|
||||
% echo "disable=2" >! /proc/mtrr
|
||||
|
||||
or using bash::
|
||||
|
||||
% echo "disable=2" >| /proc/mtrr
|
||||
|
||||
|
||||
Reading MTRRs from a C program using ioctl()'s
|
||||
==============================================
|
||||
::
|
||||
|
||||
/* mtrr-show.c
|
||||
|
||||
Source file for mtrr-show (example program to show MTRRs using ioctl()'s)
|
||||
|
||||
Copyright (C) 1997-1998 Richard Gooch
|
||||
|
||||
This program is free software; you can redistribute it and/or modify
|
||||
it under the terms of the GNU General Public License as published by
|
||||
the Free Software Foundation; either version 2 of the License, or
|
||||
(at your option) any later version.
|
||||
|
||||
This program is distributed in the hope that it will be useful,
|
||||
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
GNU General Public License for more details.
|
||||
|
||||
You should have received a copy of the GNU General Public License
|
||||
along with this program; if not, write to the Free Software
|
||||
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
|
||||
|
||||
Richard Gooch may be reached by email at rgooch@atnf.csiro.au
|
||||
The postal address is:
|
||||
Richard Gooch, c/o ATNF, P. O. Box 76, Epping, N.S.W., 2121, Australia.
|
||||
*/
|
||||
|
||||
/*
|
||||
This program will use an ioctl() on /proc/mtrr to show the current MTRR
|
||||
settings. This is an alternative to reading /proc/mtrr.
|
||||
|
||||
|
||||
Written by Richard Gooch 17-DEC-1997
|
||||
|
||||
Last updated by Richard Gooch 2-MAY-1998
|
||||
|
||||
|
||||
*/
|
||||
#include <stdio.h>
|
||||
#include <stdlib.h>
|
||||
#include <string.h>
|
||||
#include <sys/types.h>
|
||||
#include <sys/stat.h>
|
||||
#include <fcntl.h>
|
||||
#include <sys/ioctl.h>
|
||||
#include <errno.h>
|
||||
#include <asm/mtrr.h>
|
||||
|
||||
#define TRUE 1
|
||||
#define FALSE 0
|
||||
#define ERRSTRING strerror (errno)
|
||||
|
||||
static char *mtrr_strings[MTRR_NUM_TYPES] =
|
||||
{
|
||||
"uncachable", /* 0 */
|
||||
"write-combining", /* 1 */
|
||||
"?", /* 2 */
|
||||
"?", /* 3 */
|
||||
"write-through", /* 4 */
|
||||
"write-protect", /* 5 */
|
||||
"write-back", /* 6 */
|
||||
};
|
||||
|
||||
int main ()
|
||||
{
|
||||
int fd;
|
||||
struct mtrr_gentry gentry;
|
||||
|
||||
if ( ( fd = open ("/proc/mtrr", O_RDONLY, 0) ) == -1 )
|
||||
{
|
||||
if (errno == ENOENT)
|
||||
{
|
||||
fputs ("/proc/mtrr not found: not supported or you don't have a PPro?\n",
|
||||
stderr);
|
||||
exit (1);
|
||||
}
|
||||
fprintf (stderr, "Error opening /proc/mtrr\t%s\n", ERRSTRING);
|
||||
exit (2);
|
||||
}
|
||||
for (gentry.regnum = 0; ioctl (fd, MTRRIOC_GET_ENTRY, &gentry) == 0;
|
||||
++gentry.regnum)
|
||||
{
|
||||
if (gentry.size < 1)
|
||||
{
|
||||
fprintf (stderr, "Register: %u disabled\n", gentry.regnum);
|
||||
continue;
|
||||
}
|
||||
fprintf (stderr, "Register: %u base: 0x%lx size: 0x%lx type: %s\n",
|
||||
gentry.regnum, gentry.base, gentry.size,
|
||||
mtrr_strings[gentry.type]);
|
||||
}
|
||||
if (errno == EINVAL) exit (0);
|
||||
fprintf (stderr, "Error doing ioctl(2) on /dev/mtrr\t%s\n", ERRSTRING);
|
||||
exit (3);
|
||||
} /* End Function main */
|
||||
|
||||
|
||||
Creating MTRRs from a C programme using ioctl()'s
|
||||
=================================================
|
||||
::
|
||||
|
||||
/* mtrr-add.c
|
||||
|
||||
Source file for mtrr-add (example programme to add an MTRRs using ioctl())
|
||||
|
||||
Copyright (C) 1997-1998 Richard Gooch
|
||||
|
||||
This program is free software; you can redistribute it and/or modify
|
||||
it under the terms of the GNU General Public License as published by
|
||||
the Free Software Foundation; either version 2 of the License, or
|
||||
(at your option) any later version.
|
||||
|
||||
This program is distributed in the hope that it will be useful,
|
||||
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
GNU General Public License for more details.
|
||||
|
||||
You should have received a copy of the GNU General Public License
|
||||
along with this program; if not, write to the Free Software
|
||||
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
|
||||
|
||||
Richard Gooch may be reached by email at rgooch@atnf.csiro.au
|
||||
The postal address is:
|
||||
Richard Gooch, c/o ATNF, P. O. Box 76, Epping, N.S.W., 2121, Australia.
|
||||
*/
|
||||
|
||||
/*
|
||||
This programme will use an ioctl() on /proc/mtrr to add an entry. The first
|
||||
available mtrr is used. This is an alternative to writing /proc/mtrr.
|
||||
|
||||
|
||||
Written by Richard Gooch 17-DEC-1997
|
||||
|
||||
Last updated by Richard Gooch 2-MAY-1998
|
||||
|
||||
|
||||
*/
|
||||
#include <stdio.h>
|
||||
#include <string.h>
|
||||
#include <stdlib.h>
|
||||
#include <unistd.h>
|
||||
#include <sys/types.h>
|
||||
#include <sys/stat.h>
|
||||
#include <fcntl.h>
|
||||
#include <sys/ioctl.h>
|
||||
#include <errno.h>
|
||||
#include <asm/mtrr.h>
|
||||
|
||||
#define TRUE 1
|
||||
#define FALSE 0
|
||||
#define ERRSTRING strerror (errno)
|
||||
|
||||
static char *mtrr_strings[MTRR_NUM_TYPES] =
|
||||
{
|
||||
"uncachable", /* 0 */
|
||||
"write-combining", /* 1 */
|
||||
"?", /* 2 */
|
||||
"?", /* 3 */
|
||||
"write-through", /* 4 */
|
||||
"write-protect", /* 5 */
|
||||
"write-back", /* 6 */
|
||||
};
|
||||
|
||||
int main (int argc, char **argv)
|
||||
{
|
||||
int fd;
|
||||
struct mtrr_sentry sentry;
|
||||
|
||||
if (argc != 4)
|
||||
{
|
||||
fprintf (stderr, "Usage:\tmtrr-add base size type\n");
|
||||
exit (1);
|
||||
}
|
||||
sentry.base = strtoul (argv[1], NULL, 0);
|
||||
sentry.size = strtoul (argv[2], NULL, 0);
|
||||
for (sentry.type = 0; sentry.type < MTRR_NUM_TYPES; ++sentry.type)
|
||||
{
|
||||
if (strcmp (argv[3], mtrr_strings[sentry.type]) == 0) break;
|
||||
}
|
||||
if (sentry.type >= MTRR_NUM_TYPES)
|
||||
{
|
||||
fprintf (stderr, "Illegal type: \"%s\"\n", argv[3]);
|
||||
exit (2);
|
||||
}
|
||||
if ( ( fd = open ("/proc/mtrr", O_WRONLY, 0) ) == -1 )
|
||||
{
|
||||
if (errno == ENOENT)
|
||||
{
|
||||
fputs ("/proc/mtrr not found: not supported or you don't have a PPro?\n",
|
||||
stderr);
|
||||
exit (3);
|
||||
}
|
||||
fprintf (stderr, "Error opening /proc/mtrr\t%s\n", ERRSTRING);
|
||||
exit (4);
|
||||
}
|
||||
if (ioctl (fd, MTRRIOC_ADD_ENTRY, &sentry) == -1)
|
||||
{
|
||||
fprintf (stderr, "Error doing ioctl(2) on /dev/mtrr\t%s\n", ERRSTRING);
|
||||
exit (5);
|
||||
}
|
||||
fprintf (stderr, "Sleeping for 5 seconds so you can see the new entry\n");
|
||||
sleep (5);
|
||||
close (fd);
|
||||
fputs ("I've just closed /proc/mtrr so now the new entry should be gone\n",
|
||||
stderr);
|
||||
} /* End Function main */
|
||||
@@ -1,329 +0,0 @@
|
||||
MTRR (Memory Type Range Register) control
|
||||
|
||||
Richard Gooch <rgooch@atnf.csiro.au> - 3 Jun 1999
|
||||
Luis R. Rodriguez <mcgrof@do-not-panic.com> - April 9, 2015
|
||||
|
||||
===============================================================================
|
||||
Phasing out MTRR use
|
||||
|
||||
MTRR use is replaced on modern x86 hardware with PAT. Direct MTRR use by
|
||||
drivers on Linux is now completely phased out, device drivers should use
|
||||
arch_phys_wc_add() in combination with ioremap_wc() to make MTRR effective on
|
||||
non-PAT systems while a no-op but equally effective on PAT enabled systems.
|
||||
|
||||
Even if Linux does not use MTRRs directly, some x86 platform firmware may still
|
||||
set up MTRRs early before booting the OS. They do this as some platform
|
||||
firmware may still have implemented access to MTRRs which would be controlled
|
||||
and handled by the platform firmware directly. An example of platform use of
|
||||
MTRRs is through the use of SMI handlers, one case could be for fan control,
|
||||
the platform code would need uncachable access to some of its fan control
|
||||
registers. Such platform access does not need any Operating System MTRR code in
|
||||
place other than mtrr_type_lookup() to ensure any OS specific mapping requests
|
||||
are aligned with platform MTRR setup. If MTRRs are only set up by the platform
|
||||
firmware code though and the OS does not make any specific MTRR mapping
|
||||
requests mtrr_type_lookup() should always return MTRR_TYPE_INVALID.
|
||||
|
||||
For details refer to Documentation/x86/pat.txt.
|
||||
|
||||
===============================================================================
|
||||
|
||||
On Intel P6 family processors (Pentium Pro, Pentium II and later)
|
||||
the Memory Type Range Registers (MTRRs) may be used to control
|
||||
processor access to memory ranges. This is most useful when you have
|
||||
a video (VGA) card on a PCI or AGP bus. Enabling write-combining
|
||||
allows bus write transfers to be combined into a larger transfer
|
||||
before bursting over the PCI/AGP bus. This can increase performance
|
||||
of image write operations 2.5 times or more.
|
||||
|
||||
The Cyrix 6x86, 6x86MX and M II processors have Address Range
|
||||
Registers (ARRs) which provide a similar functionality to MTRRs. For
|
||||
these, the ARRs are used to emulate the MTRRs.
|
||||
|
||||
The AMD K6-2 (stepping 8 and above) and K6-3 processors have two
|
||||
MTRRs. These are supported. The AMD Athlon family provide 8 Intel
|
||||
style MTRRs.
|
||||
|
||||
The Centaur C6 (WinChip) has 8 MCRs, allowing write-combining. These
|
||||
are supported.
|
||||
|
||||
The VIA Cyrix III and VIA C3 CPUs offer 8 Intel style MTRRs.
|
||||
|
||||
The CONFIG_MTRR option creates a /proc/mtrr file which may be used
|
||||
to manipulate your MTRRs. Typically the X server should use
|
||||
this. This should have a reasonably generic interface so that
|
||||
similar control registers on other processors can be easily
|
||||
supported.
|
||||
|
||||
|
||||
There are two interfaces to /proc/mtrr: one is an ASCII interface
|
||||
which allows you to read and write. The other is an ioctl()
|
||||
interface. The ASCII interface is meant for administration. The
|
||||
ioctl() interface is meant for C programs (i.e. the X server). The
|
||||
interfaces are described below, with sample commands and C code.
|
||||
|
||||
===============================================================================
|
||||
Reading MTRRs from the shell:
|
||||
|
||||
% cat /proc/mtrr
|
||||
reg00: base=0x00000000 ( 0MB), size= 128MB: write-back, count=1
|
||||
reg01: base=0x08000000 ( 128MB), size= 64MB: write-back, count=1
|
||||
===============================================================================
|
||||
Creating MTRRs from the C-shell:
|
||||
# echo "base=0xf8000000 size=0x400000 type=write-combining" >! /proc/mtrr
|
||||
or if you use bash:
|
||||
# echo "base=0xf8000000 size=0x400000 type=write-combining" >| /proc/mtrr
|
||||
|
||||
And the result thereof:
|
||||
% cat /proc/mtrr
|
||||
reg00: base=0x00000000 ( 0MB), size= 128MB: write-back, count=1
|
||||
reg01: base=0x08000000 ( 128MB), size= 64MB: write-back, count=1
|
||||
reg02: base=0xf8000000 (3968MB), size= 4MB: write-combining, count=1
|
||||
|
||||
This is for video RAM at base address 0xf8000000 and size 4 megabytes. To
|
||||
find out your base address, you need to look at the output of your X
|
||||
server, which tells you where the linear framebuffer address is. A
|
||||
typical line that you may get is:
|
||||
|
||||
(--) S3: PCI: 968 rev 0, Linear FB @ 0xf8000000
|
||||
|
||||
Note that you should only use the value from the X server, as it may
|
||||
move the framebuffer base address, so the only value you can trust is
|
||||
that reported by the X server.
|
||||
|
||||
To find out the size of your framebuffer (what, you don't actually
|
||||
know?), the following line will tell you:
|
||||
|
||||
(--) S3: videoram: 4096k
|
||||
|
||||
That's 4 megabytes, which is 0x400000 bytes (in hexadecimal).
|
||||
A patch is being written for XFree86 which will make this automatic:
|
||||
in other words the X server will manipulate /proc/mtrr using the
|
||||
ioctl() interface, so users won't have to do anything. If you use a
|
||||
commercial X server, lobby your vendor to add support for MTRRs.
|
||||
===============================================================================
|
||||
Creating overlapping MTRRs:
|
||||
|
||||
%echo "base=0xfb000000 size=0x1000000 type=write-combining" >/proc/mtrr
|
||||
%echo "base=0xfb000000 size=0x1000 type=uncachable" >/proc/mtrr
|
||||
|
||||
And the results: cat /proc/mtrr
|
||||
reg00: base=0x00000000 ( 0MB), size= 64MB: write-back, count=1
|
||||
reg01: base=0xfb000000 (4016MB), size= 16MB: write-combining, count=1
|
||||
reg02: base=0xfb000000 (4016MB), size= 4kB: uncachable, count=1
|
||||
|
||||
Some cards (especially Voodoo Graphics boards) need this 4 kB area
|
||||
excluded from the beginning of the region because it is used for
|
||||
registers.
|
||||
|
||||
NOTE: You can only create type=uncachable region, if the first
|
||||
region that you created is type=write-combining.
|
||||
===============================================================================
|
||||
Removing MTRRs from the C-shell:
|
||||
% echo "disable=2" >! /proc/mtrr
|
||||
or using bash:
|
||||
% echo "disable=2" >| /proc/mtrr
|
||||
===============================================================================
|
||||
Reading MTRRs from a C program using ioctl()'s:
|
||||
|
||||
/* mtrr-show.c
|
||||
|
||||
Source file for mtrr-show (example program to show MTRRs using ioctl()'s)
|
||||
|
||||
Copyright (C) 1997-1998 Richard Gooch
|
||||
|
||||
This program is free software; you can redistribute it and/or modify
|
||||
it under the terms of the GNU General Public License as published by
|
||||
the Free Software Foundation; either version 2 of the License, or
|
||||
(at your option) any later version.
|
||||
|
||||
This program is distributed in the hope that it will be useful,
|
||||
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
GNU General Public License for more details.
|
||||
|
||||
You should have received a copy of the GNU General Public License
|
||||
along with this program; if not, write to the Free Software
|
||||
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
|
||||
|
||||
Richard Gooch may be reached by email at rgooch@atnf.csiro.au
|
||||
The postal address is:
|
||||
Richard Gooch, c/o ATNF, P. O. Box 76, Epping, N.S.W., 2121, Australia.
|
||||
*/
|
||||
|
||||
/*
|
||||
This program will use an ioctl() on /proc/mtrr to show the current MTRR
|
||||
settings. This is an alternative to reading /proc/mtrr.
|
||||
|
||||
|
||||
Written by Richard Gooch 17-DEC-1997
|
||||
|
||||
Last updated by Richard Gooch 2-MAY-1998
|
||||
|
||||
|
||||
*/
|
||||
#include <stdio.h>
|
||||
#include <stdlib.h>
|
||||
#include <string.h>
|
||||
#include <sys/types.h>
|
||||
#include <sys/stat.h>
|
||||
#include <fcntl.h>
|
||||
#include <sys/ioctl.h>
|
||||
#include <errno.h>
|
||||
#include <asm/mtrr.h>
|
||||
|
||||
#define TRUE 1
|
||||
#define FALSE 0
|
||||
#define ERRSTRING strerror (errno)
|
||||
|
||||
static char *mtrr_strings[MTRR_NUM_TYPES] =
|
||||
{
|
||||
"uncachable", /* 0 */
|
||||
"write-combining", /* 1 */
|
||||
"?", /* 2 */
|
||||
"?", /* 3 */
|
||||
"write-through", /* 4 */
|
||||
"write-protect", /* 5 */
|
||||
"write-back", /* 6 */
|
||||
};
|
||||
|
||||
int main ()
|
||||
{
|
||||
int fd;
|
||||
struct mtrr_gentry gentry;
|
||||
|
||||
if ( ( fd = open ("/proc/mtrr", O_RDONLY, 0) ) == -1 )
|
||||
{
|
||||
if (errno == ENOENT)
|
||||
{
|
||||
fputs ("/proc/mtrr not found: not supported or you don't have a PPro?\n",
|
||||
stderr);
|
||||
exit (1);
|
||||
}
|
||||
fprintf (stderr, "Error opening /proc/mtrr\t%s\n", ERRSTRING);
|
||||
exit (2);
|
||||
}
|
||||
for (gentry.regnum = 0; ioctl (fd, MTRRIOC_GET_ENTRY, &gentry) == 0;
|
||||
++gentry.regnum)
|
||||
{
|
||||
if (gentry.size < 1)
|
||||
{
|
||||
fprintf (stderr, "Register: %u disabled\n", gentry.regnum);
|
||||
continue;
|
||||
}
|
||||
fprintf (stderr, "Register: %u base: 0x%lx size: 0x%lx type: %s\n",
|
||||
gentry.regnum, gentry.base, gentry.size,
|
||||
mtrr_strings[gentry.type]);
|
||||
}
|
||||
if (errno == EINVAL) exit (0);
|
||||
fprintf (stderr, "Error doing ioctl(2) on /dev/mtrr\t%s\n", ERRSTRING);
|
||||
exit (3);
|
||||
} /* End Function main */
|
||||
===============================================================================
|
||||
Creating MTRRs from a C programme using ioctl()'s:
|
||||
|
||||
/* mtrr-add.c
|
||||
|
||||
Source file for mtrr-add (example programme to add an MTRRs using ioctl())
|
||||
|
||||
Copyright (C) 1997-1998 Richard Gooch
|
||||
|
||||
This program is free software; you can redistribute it and/or modify
|
||||
it under the terms of the GNU General Public License as published by
|
||||
the Free Software Foundation; either version 2 of the License, or
|
||||
(at your option) any later version.
|
||||
|
||||
This program is distributed in the hope that it will be useful,
|
||||
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
GNU General Public License for more details.
|
||||
|
||||
You should have received a copy of the GNU General Public License
|
||||
along with this program; if not, write to the Free Software
|
||||
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
|
||||
|
||||
Richard Gooch may be reached by email at rgooch@atnf.csiro.au
|
||||
The postal address is:
|
||||
Richard Gooch, c/o ATNF, P. O. Box 76, Epping, N.S.W., 2121, Australia.
|
||||
*/
|
||||
|
||||
/*
|
||||
This programme will use an ioctl() on /proc/mtrr to add an entry. The first
|
||||
available mtrr is used. This is an alternative to writing /proc/mtrr.
|
||||
|
||||
|
||||
Written by Richard Gooch 17-DEC-1997
|
||||
|
||||
Last updated by Richard Gooch 2-MAY-1998
|
||||
|
||||
|
||||
*/
|
||||
#include <stdio.h>
|
||||
#include <string.h>
|
||||
#include <stdlib.h>
|
||||
#include <unistd.h>
|
||||
#include <sys/types.h>
|
||||
#include <sys/stat.h>
|
||||
#include <fcntl.h>
|
||||
#include <sys/ioctl.h>
|
||||
#include <errno.h>
|
||||
#include <asm/mtrr.h>
|
||||
|
||||
#define TRUE 1
|
||||
#define FALSE 0
|
||||
#define ERRSTRING strerror (errno)
|
||||
|
||||
static char *mtrr_strings[MTRR_NUM_TYPES] =
|
||||
{
|
||||
"uncachable", /* 0 */
|
||||
"write-combining", /* 1 */
|
||||
"?", /* 2 */
|
||||
"?", /* 3 */
|
||||
"write-through", /* 4 */
|
||||
"write-protect", /* 5 */
|
||||
"write-back", /* 6 */
|
||||
};
|
||||
|
||||
int main (int argc, char **argv)
|
||||
{
|
||||
int fd;
|
||||
struct mtrr_sentry sentry;
|
||||
|
||||
if (argc != 4)
|
||||
{
|
||||
fprintf (stderr, "Usage:\tmtrr-add base size type\n");
|
||||
exit (1);
|
||||
}
|
||||
sentry.base = strtoul (argv[1], NULL, 0);
|
||||
sentry.size = strtoul (argv[2], NULL, 0);
|
||||
for (sentry.type = 0; sentry.type < MTRR_NUM_TYPES; ++sentry.type)
|
||||
{
|
||||
if (strcmp (argv[3], mtrr_strings[sentry.type]) == 0) break;
|
||||
}
|
||||
if (sentry.type >= MTRR_NUM_TYPES)
|
||||
{
|
||||
fprintf (stderr, "Illegal type: \"%s\"\n", argv[3]);
|
||||
exit (2);
|
||||
}
|
||||
if ( ( fd = open ("/proc/mtrr", O_WRONLY, 0) ) == -1 )
|
||||
{
|
||||
if (errno == ENOENT)
|
||||
{
|
||||
fputs ("/proc/mtrr not found: not supported or you don't have a PPro?\n",
|
||||
stderr);
|
||||
exit (3);
|
||||
}
|
||||
fprintf (stderr, "Error opening /proc/mtrr\t%s\n", ERRSTRING);
|
||||
exit (4);
|
||||
}
|
||||
if (ioctl (fd, MTRRIOC_ADD_ENTRY, &sentry) == -1)
|
||||
{
|
||||
fprintf (stderr, "Error doing ioctl(2) on /dev/mtrr\t%s\n", ERRSTRING);
|
||||
exit (5);
|
||||
}
|
||||
fprintf (stderr, "Sleeping for 5 seconds so you can see the new entry\n");
|
||||
sleep (5);
|
||||
close (fd);
|
||||
fputs ("I've just closed /proc/mtrr so now the new entry should be gone\n",
|
||||
stderr);
|
||||
} /* End Function main */
|
||||
===============================================================================
|
||||
@@ -1,8 +1,11 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
============
|
||||
ORC unwinder
|
||||
============
|
||||
|
||||
Overview
|
||||
--------
|
||||
========
|
||||
|
||||
The kernel CONFIG_UNWINDER_ORC option enables the ORC unwinder, which is
|
||||
similar in concept to a DWARF unwinder. The difference is that the
|
||||
@@ -23,12 +26,12 @@ correlate instruction addresses with their stack states at run time.
|
||||
|
||||
|
||||
ORC vs frame pointers
|
||||
---------------------
|
||||
=====================
|
||||
|
||||
With frame pointers enabled, GCC adds instrumentation code to every
|
||||
function in the kernel. The kernel's .text size increases by about
|
||||
3.2%, resulting in a broad kernel-wide slowdown. Measurements by Mel
|
||||
Gorman [1] have shown a slowdown of 5-10% for some workloads.
|
||||
Gorman [1]_ have shown a slowdown of 5-10% for some workloads.
|
||||
|
||||
In contrast, the ORC unwinder has no effect on text size or runtime
|
||||
performance, because the debuginfo is out of band. So if you disable
|
||||
@@ -55,7 +58,7 @@ depending on the kernel config.
|
||||
|
||||
|
||||
ORC vs DWARF
|
||||
------------
|
||||
============
|
||||
|
||||
ORC debuginfo's advantage over DWARF itself is that it's much simpler.
|
||||
It gets rid of the complex DWARF CFI state machine and also gets rid of
|
||||
@@ -65,7 +68,7 @@ mission critical oops code.
|
||||
|
||||
The simpler debuginfo format also enables the unwinder to be much faster
|
||||
than DWARF, which is important for perf and lockdep. In a basic
|
||||
performance test by Jiri Slaby [2], the ORC unwinder was about 20x
|
||||
performance test by Jiri Slaby [2]_, the ORC unwinder was about 20x
|
||||
faster than an out-of-tree DWARF unwinder. (Note: That measurement was
|
||||
taken before some performance tweaks were added, which doubled
|
||||
performance, so the speedup over DWARF may be closer to 40x.)
|
||||
@@ -85,7 +88,7 @@ still be able to control the format, e.g. no complex state machines.
|
||||
|
||||
|
||||
ORC unwind table generation
|
||||
---------------------------
|
||||
===========================
|
||||
|
||||
The ORC data is generated by objtool. With the existing compile-time
|
||||
stack metadata validation feature, objtool already follows all code
|
||||
@@ -133,7 +136,7 @@ objtool follows GCC code quite well.
|
||||
|
||||
|
||||
Unwinder implementation details
|
||||
-------------------------------
|
||||
===============================
|
||||
|
||||
Objtool generates the ORC data by integrating with the compile-time
|
||||
stack metadata validation feature, which is described in detail in
|
||||
@@ -154,7 +157,7 @@ subset of the table needs to be searched.
|
||||
|
||||
|
||||
Etymology
|
||||
---------
|
||||
=========
|
||||
|
||||
Orcs, fearsome creatures of medieval folklore, are the Dwarves' natural
|
||||
enemies. Similarly, the ORC unwinder was created in opposition to the
|
||||
@@ -162,7 +165,7 @@ complexity and slowness of DWARF.
|
||||
|
||||
"Although Orcs rarely consider multiple solutions to a problem, they do
|
||||
excel at getting things done because they are creatures of action, not
|
||||
thought." [3] Similarly, unlike the esoteric DWARF unwinder, the
|
||||
thought." [3]_ Similarly, unlike the esoteric DWARF unwinder, the
|
||||
veracious ORC unwinder wastes no time or siloconic effort decoding
|
||||
variable-length zero-extended unsigned-integer byte-coded
|
||||
state-machine-based debug information entries.
|
||||
@@ -174,6 +177,6 @@ brutal, unyielding efficiency.
|
||||
ORC stands for Oops Rewind Capability.
|
||||
|
||||
|
||||
[1] https://lkml.kernel.org/r/20170602104048.jkkzssljsompjdwy@suse.de
|
||||
[2] https://lkml.kernel.org/r/d2ca5435-6386-29b8-db87-7f227c2b713a@suse.cz
|
||||
[3] http://dustin.wikidot.com/half-orcs-and-orcs
|
||||
.. [1] https://lkml.kernel.org/r/20170602104048.jkkzssljsompjdwy@suse.de
|
||||
.. [2] https://lkml.kernel.org/r/d2ca5435-6386-29b8-db87-7f227c2b713a@suse.cz
|
||||
.. [3] http://dustin.wikidot.com/half-orcs-and-orcs
|
||||
242
Documentation/x86/pat.rst
Normal file
242
Documentation/x86/pat.rst
Normal file
@@ -0,0 +1,242 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
==========================
|
||||
PAT (Page Attribute Table)
|
||||
==========================
|
||||
|
||||
x86 Page Attribute Table (PAT) allows for setting the memory attribute at the
|
||||
page level granularity. PAT is complementary to the MTRR settings which allows
|
||||
for setting of memory types over physical address ranges. However, PAT is
|
||||
more flexible than MTRR due to its capability to set attributes at page level
|
||||
and also due to the fact that there are no hardware limitations on number of
|
||||
such attribute settings allowed. Added flexibility comes with guidelines for
|
||||
not having memory type aliasing for the same physical memory with multiple
|
||||
virtual addresses.
|
||||
|
||||
PAT allows for different types of memory attributes. The most commonly used
|
||||
ones that will be supported at this time are:
|
||||
|
||||
=== ==============
|
||||
WB Write-back
|
||||
UC Uncached
|
||||
WC Write-combined
|
||||
WT Write-through
|
||||
UC- Uncached Minus
|
||||
=== ==============
|
||||
|
||||
|
||||
PAT APIs
|
||||
========
|
||||
|
||||
There are many different APIs in the kernel that allows setting of memory
|
||||
attributes at the page level. In order to avoid aliasing, these interfaces
|
||||
should be used thoughtfully. Below is a table of interfaces available,
|
||||
their intended usage and their memory attribute relationships. Internally,
|
||||
these APIs use a reserve_memtype()/free_memtype() interface on the physical
|
||||
address range to avoid any aliasing.
|
||||
|
||||
+------------------------+----------+--------------+------------------+
|
||||
| API | RAM | ACPI,... | Reserved/Holes |
|
||||
+------------------------+----------+--------------+------------------+
|
||||
| ioremap | -- | UC- | UC- |
|
||||
+------------------------+----------+--------------+------------------+
|
||||
| ioremap_cache | -- | WB | WB |
|
||||
+------------------------+----------+--------------+------------------+
|
||||
| ioremap_uc | -- | UC | UC |
|
||||
+------------------------+----------+--------------+------------------+
|
||||
| ioremap_nocache | -- | UC- | UC- |
|
||||
+------------------------+----------+--------------+------------------+
|
||||
| ioremap_wc | -- | -- | WC |
|
||||
+------------------------+----------+--------------+------------------+
|
||||
| ioremap_wt | -- | -- | WT |
|
||||
+------------------------+----------+--------------+------------------+
|
||||
| set_memory_uc, | UC- | -- | -- |
|
||||
| set_memory_wb | | | |
|
||||
+------------------------+----------+--------------+------------------+
|
||||
| set_memory_wc, | WC | -- | -- |
|
||||
| set_memory_wb | | | |
|
||||
+------------------------+----------+--------------+------------------+
|
||||
| set_memory_wt, | WT | -- | -- |
|
||||
| set_memory_wb | | | |
|
||||
+------------------------+----------+--------------+------------------+
|
||||
| pci sysfs resource | -- | -- | UC- |
|
||||
+------------------------+----------+--------------+------------------+
|
||||
| pci sysfs resource_wc | -- | -- | WC |
|
||||
| is IORESOURCE_PREFETCH | | | |
|
||||
+------------------------+----------+--------------+------------------+
|
||||
| pci proc | -- | -- | UC- |
|
||||
| !PCIIOC_WRITE_COMBINE | | | |
|
||||
+------------------------+----------+--------------+------------------+
|
||||
| pci proc | -- | -- | WC |
|
||||
| PCIIOC_WRITE_COMBINE | | | |
|
||||
+------------------------+----------+--------------+------------------+
|
||||
| /dev/mem | -- | WB/WC/UC- | WB/WC/UC- |
|
||||
| read-write | | | |
|
||||
+------------------------+----------+--------------+------------------+
|
||||
| /dev/mem | -- | UC- | UC- |
|
||||
| mmap SYNC flag | | | |
|
||||
+------------------------+----------+--------------+------------------+
|
||||
| /dev/mem | -- | WB/WC/UC- | WB/WC/UC- |
|
||||
| mmap !SYNC flag | | | |
|
||||
| and | |(from existing| (from existing |
|
||||
| any alias to this area | |alias) | alias) |
|
||||
+------------------------+----------+--------------+------------------+
|
||||
| /dev/mem | -- | WB | WB |
|
||||
| mmap !SYNC flag | | | |
|
||||
| no alias to this area | | | |
|
||||
| and | | | |
|
||||
| MTRR says WB | | | |
|
||||
+------------------------+----------+--------------+------------------+
|
||||
| /dev/mem | -- | -- | UC- |
|
||||
| mmap !SYNC flag | | | |
|
||||
| no alias to this area | | | |
|
||||
| and | | | |
|
||||
| MTRR says !WB | | | |
|
||||
+------------------------+----------+--------------+------------------+
|
||||
|
||||
|
||||
Advanced APIs for drivers
|
||||
=========================
|
||||
|
||||
A. Exporting pages to users with remap_pfn_range, io_remap_pfn_range,
|
||||
vmf_insert_pfn.
|
||||
|
||||
Drivers wanting to export some pages to userspace do it by using mmap
|
||||
interface and a combination of:
|
||||
|
||||
1) pgprot_noncached()
|
||||
2) io_remap_pfn_range() or remap_pfn_range() or vmf_insert_pfn()
|
||||
|
||||
With PAT support, a new API pgprot_writecombine is being added. So, drivers can
|
||||
continue to use the above sequence, with either pgprot_noncached() or
|
||||
pgprot_writecombine() in step 1, followed by step 2.
|
||||
|
||||
In addition, step 2 internally tracks the region as UC or WC in memtype
|
||||
list in order to ensure no conflicting mapping.
|
||||
|
||||
Note that this set of APIs only works with IO (non RAM) regions. If driver
|
||||
wants to export a RAM region, it has to do set_memory_uc() or set_memory_wc()
|
||||
as step 0 above and also track the usage of those pages and use set_memory_wb()
|
||||
before the page is freed to free pool.
|
||||
|
||||
MTRR effects on PAT / non-PAT systems
|
||||
=====================================
|
||||
|
||||
The following table provides the effects of using write-combining MTRRs when
|
||||
using ioremap*() calls on x86 for both non-PAT and PAT systems. Ideally
|
||||
mtrr_add() usage will be phased out in favor of arch_phys_wc_add() which will
|
||||
be a no-op on PAT enabled systems. The region over which a arch_phys_wc_add()
|
||||
is made, should already have been ioremapped with WC attributes or PAT entries,
|
||||
this can be done by using ioremap_wc() / set_memory_wc(). Devices which
|
||||
combine areas of IO memory desired to remain uncacheable with areas where
|
||||
write-combining is desirable should consider use of ioremap_uc() followed by
|
||||
set_memory_wc() to white-list effective write-combined areas. Such use is
|
||||
nevertheless discouraged as the effective memory type is considered
|
||||
implementation defined, yet this strategy can be used as last resort on devices
|
||||
with size-constrained regions where otherwise MTRR write-combining would
|
||||
otherwise not be effective.
|
||||
::
|
||||
|
||||
==== ======= === ========================= =====================
|
||||
MTRR Non-PAT PAT Linux ioremap value Effective memory type
|
||||
==== ======= === ========================= =====================
|
||||
PAT Non-PAT | PAT
|
||||
|PCD |
|
||||
||PWT |
|
||||
||| |
|
||||
WC 000 WB _PAGE_CACHE_MODE_WB WC | WC
|
||||
WC 001 WC _PAGE_CACHE_MODE_WC WC* | WC
|
||||
WC 010 UC- _PAGE_CACHE_MODE_UC_MINUS WC* | UC
|
||||
WC 011 UC _PAGE_CACHE_MODE_UC UC | UC
|
||||
==== ======= === ========================= =====================
|
||||
|
||||
(*) denotes implementation defined and is discouraged
|
||||
|
||||
.. note:: -- in the above table mean "Not suggested usage for the API". Some
|
||||
of the --'s are strictly enforced by the kernel. Some others are not really
|
||||
enforced today, but may be enforced in future.
|
||||
|
||||
For ioremap and pci access through /sys or /proc - The actual type returned
|
||||
can be more restrictive, in case of any existing aliasing for that address.
|
||||
For example: If there is an existing uncached mapping, a new ioremap_wc can
|
||||
return uncached mapping in place of write-combine requested.
|
||||
|
||||
set_memory_[uc|wc|wt] and set_memory_wb should be used in pairs, where driver
|
||||
will first make a region uc, wc or wt and switch it back to wb after use.
|
||||
|
||||
Over time writes to /proc/mtrr will be deprecated in favor of using PAT based
|
||||
interfaces. Users writing to /proc/mtrr are suggested to use above interfaces.
|
||||
|
||||
Drivers should use ioremap_[uc|wc] to access PCI BARs with [uc|wc] access
|
||||
types.
|
||||
|
||||
Drivers should use set_memory_[uc|wc|wt] to set access type for RAM ranges.
|
||||
|
||||
|
||||
PAT debugging
|
||||
=============
|
||||
|
||||
With CONFIG_DEBUG_FS enabled, PAT memtype list can be examined by::
|
||||
|
||||
# mount -t debugfs debugfs /sys/kernel/debug
|
||||
# cat /sys/kernel/debug/x86/pat_memtype_list
|
||||
PAT memtype list:
|
||||
uncached-minus @ 0x7fadf000-0x7fae0000
|
||||
uncached-minus @ 0x7fb19000-0x7fb1a000
|
||||
uncached-minus @ 0x7fb1a000-0x7fb1b000
|
||||
uncached-minus @ 0x7fb1b000-0x7fb1c000
|
||||
uncached-minus @ 0x7fb1c000-0x7fb1d000
|
||||
uncached-minus @ 0x7fb1d000-0x7fb1e000
|
||||
uncached-minus @ 0x7fb1e000-0x7fb25000
|
||||
uncached-minus @ 0x7fb25000-0x7fb26000
|
||||
uncached-minus @ 0x7fb26000-0x7fb27000
|
||||
uncached-minus @ 0x7fb27000-0x7fb28000
|
||||
uncached-minus @ 0x7fb28000-0x7fb2e000
|
||||
uncached-minus @ 0x7fb2e000-0x7fb2f000
|
||||
uncached-minus @ 0x7fb2f000-0x7fb30000
|
||||
uncached-minus @ 0x7fb31000-0x7fb32000
|
||||
uncached-minus @ 0x80000000-0x90000000
|
||||
|
||||
This list shows physical address ranges and various PAT settings used to
|
||||
access those physical address ranges.
|
||||
|
||||
Another, more verbose way of getting PAT related debug messages is with
|
||||
"debugpat" boot parameter. With this parameter, various debug messages are
|
||||
printed to dmesg log.
|
||||
|
||||
PAT Initialization
|
||||
==================
|
||||
|
||||
The following table describes how PAT is initialized under various
|
||||
configurations. The PAT MSR must be updated by Linux in order to support WC
|
||||
and WT attributes. Otherwise, the PAT MSR has the value programmed in it
|
||||
by the firmware. Note, Xen enables WC attribute in the PAT MSR for guests.
|
||||
|
||||
==== ===== ========================== ========= =======
|
||||
MTRR PAT Call Sequence PAT State PAT MSR
|
||||
==== ===== ========================== ========= =======
|
||||
E E MTRR -> PAT init Enabled OS
|
||||
E D MTRR -> PAT init Disabled -
|
||||
D E MTRR -> PAT disable Disabled BIOS
|
||||
D D MTRR -> PAT disable Disabled -
|
||||
- np/E PAT -> PAT disable Disabled BIOS
|
||||
- np/D PAT -> PAT disable Disabled -
|
||||
E !P/E MTRR -> PAT init Disabled BIOS
|
||||
D !P/E MTRR -> PAT disable Disabled BIOS
|
||||
!M !P/E MTRR stub -> PAT disable Disabled BIOS
|
||||
==== ===== ========================== ========= =======
|
||||
|
||||
Legend
|
||||
|
||||
========= =======================================
|
||||
E Feature enabled in CPU
|
||||
D Feature disabled/unsupported in CPU
|
||||
np "nopat" boot option specified
|
||||
!P CONFIG_X86_PAT option unset
|
||||
!M CONFIG_MTRR option unset
|
||||
Enabled PAT state set to enabled
|
||||
Disabled PAT state set to disabled
|
||||
OS PAT initializes PAT MSR with OS setting
|
||||
BIOS PAT keeps PAT MSR with BIOS setting
|
||||
========= =======================================
|
||||
|
||||
@@ -1,230 +0,0 @@
|
||||
|
||||
PAT (Page Attribute Table)
|
||||
|
||||
x86 Page Attribute Table (PAT) allows for setting the memory attribute at the
|
||||
page level granularity. PAT is complementary to the MTRR settings which allows
|
||||
for setting of memory types over physical address ranges. However, PAT is
|
||||
more flexible than MTRR due to its capability to set attributes at page level
|
||||
and also due to the fact that there are no hardware limitations on number of
|
||||
such attribute settings allowed. Added flexibility comes with guidelines for
|
||||
not having memory type aliasing for the same physical memory with multiple
|
||||
virtual addresses.
|
||||
|
||||
PAT allows for different types of memory attributes. The most commonly used
|
||||
ones that will be supported at this time are Write-back, Uncached,
|
||||
Write-combined, Write-through and Uncached Minus.
|
||||
|
||||
|
||||
PAT APIs
|
||||
--------
|
||||
|
||||
There are many different APIs in the kernel that allows setting of memory
|
||||
attributes at the page level. In order to avoid aliasing, these interfaces
|
||||
should be used thoughtfully. Below is a table of interfaces available,
|
||||
their intended usage and their memory attribute relationships. Internally,
|
||||
these APIs use a reserve_memtype()/free_memtype() interface on the physical
|
||||
address range to avoid any aliasing.
|
||||
|
||||
|
||||
-------------------------------------------------------------------
|
||||
API | RAM | ACPI,... | Reserved/Holes |
|
||||
-----------------------|----------|------------|------------------|
|
||||
| | | |
|
||||
ioremap | -- | UC- | UC- |
|
||||
| | | |
|
||||
ioremap_cache | -- | WB | WB |
|
||||
| | | |
|
||||
ioremap_uc | -- | UC | UC |
|
||||
| | | |
|
||||
ioremap_nocache | -- | UC- | UC- |
|
||||
| | | |
|
||||
ioremap_wc | -- | -- | WC |
|
||||
| | | |
|
||||
ioremap_wt | -- | -- | WT |
|
||||
| | | |
|
||||
set_memory_uc | UC- | -- | -- |
|
||||
set_memory_wb | | | |
|
||||
| | | |
|
||||
set_memory_wc | WC | -- | -- |
|
||||
set_memory_wb | | | |
|
||||
| | | |
|
||||
set_memory_wt | WT | -- | -- |
|
||||
set_memory_wb | | | |
|
||||
| | | |
|
||||
pci sysfs resource | -- | -- | UC- |
|
||||
| | | |
|
||||
pci sysfs resource_wc | -- | -- | WC |
|
||||
is IORESOURCE_PREFETCH| | | |
|
||||
| | | |
|
||||
pci proc | -- | -- | UC- |
|
||||
!PCIIOC_WRITE_COMBINE | | | |
|
||||
| | | |
|
||||
pci proc | -- | -- | WC |
|
||||
PCIIOC_WRITE_COMBINE | | | |
|
||||
| | | |
|
||||
/dev/mem | -- | WB/WC/UC- | WB/WC/UC- |
|
||||
read-write | | | |
|
||||
| | | |
|
||||
/dev/mem | -- | UC- | UC- |
|
||||
mmap SYNC flag | | | |
|
||||
| | | |
|
||||
/dev/mem | -- | WB/WC/UC- | WB/WC/UC- |
|
||||
mmap !SYNC flag | |(from exist-| (from exist- |
|
||||
and | | ing alias)| ing alias) |
|
||||
any alias to this area| | | |
|
||||
| | | |
|
||||
/dev/mem | -- | WB | WB |
|
||||
mmap !SYNC flag | | | |
|
||||
no alias to this area | | | |
|
||||
and | | | |
|
||||
MTRR says WB | | | |
|
||||
| | | |
|
||||
/dev/mem | -- | -- | UC- |
|
||||
mmap !SYNC flag | | | |
|
||||
no alias to this area | | | |
|
||||
and | | | |
|
||||
MTRR says !WB | | | |
|
||||
| | | |
|
||||
-------------------------------------------------------------------
|
||||
|
||||
Advanced APIs for drivers
|
||||
-------------------------
|
||||
A. Exporting pages to users with remap_pfn_range, io_remap_pfn_range,
|
||||
vmf_insert_pfn
|
||||
|
||||
Drivers wanting to export some pages to userspace do it by using mmap
|
||||
interface and a combination of
|
||||
1) pgprot_noncached()
|
||||
2) io_remap_pfn_range() or remap_pfn_range() or vmf_insert_pfn()
|
||||
|
||||
With PAT support, a new API pgprot_writecombine is being added. So, drivers can
|
||||
continue to use the above sequence, with either pgprot_noncached() or
|
||||
pgprot_writecombine() in step 1, followed by step 2.
|
||||
|
||||
In addition, step 2 internally tracks the region as UC or WC in memtype
|
||||
list in order to ensure no conflicting mapping.
|
||||
|
||||
Note that this set of APIs only works with IO (non RAM) regions. If driver
|
||||
wants to export a RAM region, it has to do set_memory_uc() or set_memory_wc()
|
||||
as step 0 above and also track the usage of those pages and use set_memory_wb()
|
||||
before the page is freed to free pool.
|
||||
|
||||
MTRR effects on PAT / non-PAT systems
|
||||
-------------------------------------
|
||||
|
||||
The following table provides the effects of using write-combining MTRRs when
|
||||
using ioremap*() calls on x86 for both non-PAT and PAT systems. Ideally
|
||||
mtrr_add() usage will be phased out in favor of arch_phys_wc_add() which will
|
||||
be a no-op on PAT enabled systems. The region over which a arch_phys_wc_add()
|
||||
is made, should already have been ioremapped with WC attributes or PAT entries,
|
||||
this can be done by using ioremap_wc() / set_memory_wc(). Devices which
|
||||
combine areas of IO memory desired to remain uncacheable with areas where
|
||||
write-combining is desirable should consider use of ioremap_uc() followed by
|
||||
set_memory_wc() to white-list effective write-combined areas. Such use is
|
||||
nevertheless discouraged as the effective memory type is considered
|
||||
implementation defined, yet this strategy can be used as last resort on devices
|
||||
with size-constrained regions where otherwise MTRR write-combining would
|
||||
otherwise not be effective.
|
||||
|
||||
----------------------------------------------------------------------
|
||||
MTRR Non-PAT PAT Linux ioremap value Effective memory type
|
||||
----------------------------------------------------------------------
|
||||
Non-PAT | PAT
|
||||
PAT
|
||||
|PCD
|
||||
||PWT
|
||||
|||
|
||||
WC 000 WB _PAGE_CACHE_MODE_WB WC | WC
|
||||
WC 001 WC _PAGE_CACHE_MODE_WC WC* | WC
|
||||
WC 010 UC- _PAGE_CACHE_MODE_UC_MINUS WC* | UC
|
||||
WC 011 UC _PAGE_CACHE_MODE_UC UC | UC
|
||||
----------------------------------------------------------------------
|
||||
|
||||
(*) denotes implementation defined and is discouraged
|
||||
|
||||
Notes:
|
||||
|
||||
-- in the above table mean "Not suggested usage for the API". Some of the --'s
|
||||
are strictly enforced by the kernel. Some others are not really enforced
|
||||
today, but may be enforced in future.
|
||||
|
||||
For ioremap and pci access through /sys or /proc - The actual type returned
|
||||
can be more restrictive, in case of any existing aliasing for that address.
|
||||
For example: If there is an existing uncached mapping, a new ioremap_wc can
|
||||
return uncached mapping in place of write-combine requested.
|
||||
|
||||
set_memory_[uc|wc|wt] and set_memory_wb should be used in pairs, where driver
|
||||
will first make a region uc, wc or wt and switch it back to wb after use.
|
||||
|
||||
Over time writes to /proc/mtrr will be deprecated in favor of using PAT based
|
||||
interfaces. Users writing to /proc/mtrr are suggested to use above interfaces.
|
||||
|
||||
Drivers should use ioremap_[uc|wc] to access PCI BARs with [uc|wc] access
|
||||
types.
|
||||
|
||||
Drivers should use set_memory_[uc|wc|wt] to set access type for RAM ranges.
|
||||
|
||||
|
||||
PAT debugging
|
||||
-------------
|
||||
|
||||
With CONFIG_DEBUG_FS enabled, PAT memtype list can be examined by
|
||||
|
||||
# mount -t debugfs debugfs /sys/kernel/debug
|
||||
# cat /sys/kernel/debug/x86/pat_memtype_list
|
||||
PAT memtype list:
|
||||
uncached-minus @ 0x7fadf000-0x7fae0000
|
||||
uncached-minus @ 0x7fb19000-0x7fb1a000
|
||||
uncached-minus @ 0x7fb1a000-0x7fb1b000
|
||||
uncached-minus @ 0x7fb1b000-0x7fb1c000
|
||||
uncached-minus @ 0x7fb1c000-0x7fb1d000
|
||||
uncached-minus @ 0x7fb1d000-0x7fb1e000
|
||||
uncached-minus @ 0x7fb1e000-0x7fb25000
|
||||
uncached-minus @ 0x7fb25000-0x7fb26000
|
||||
uncached-minus @ 0x7fb26000-0x7fb27000
|
||||
uncached-minus @ 0x7fb27000-0x7fb28000
|
||||
uncached-minus @ 0x7fb28000-0x7fb2e000
|
||||
uncached-minus @ 0x7fb2e000-0x7fb2f000
|
||||
uncached-minus @ 0x7fb2f000-0x7fb30000
|
||||
uncached-minus @ 0x7fb31000-0x7fb32000
|
||||
uncached-minus @ 0x80000000-0x90000000
|
||||
|
||||
This list shows physical address ranges and various PAT settings used to
|
||||
access those physical address ranges.
|
||||
|
||||
Another, more verbose way of getting PAT related debug messages is with
|
||||
"debugpat" boot parameter. With this parameter, various debug messages are
|
||||
printed to dmesg log.
|
||||
|
||||
PAT Initialization
|
||||
------------------
|
||||
|
||||
The following table describes how PAT is initialized under various
|
||||
configurations. The PAT MSR must be updated by Linux in order to support WC
|
||||
and WT attributes. Otherwise, the PAT MSR has the value programmed in it
|
||||
by the firmware. Note, Xen enables WC attribute in the PAT MSR for guests.
|
||||
|
||||
MTRR PAT Call Sequence PAT State PAT MSR
|
||||
=========================================================
|
||||
E E MTRR -> PAT init Enabled OS
|
||||
E D MTRR -> PAT init Disabled -
|
||||
D E MTRR -> PAT disable Disabled BIOS
|
||||
D D MTRR -> PAT disable Disabled -
|
||||
- np/E PAT -> PAT disable Disabled BIOS
|
||||
- np/D PAT -> PAT disable Disabled -
|
||||
E !P/E MTRR -> PAT init Disabled BIOS
|
||||
D !P/E MTRR -> PAT disable Disabled BIOS
|
||||
!M !P/E MTRR stub -> PAT disable Disabled BIOS
|
||||
|
||||
Legend
|
||||
------------------------------------------------
|
||||
E Feature enabled in CPU
|
||||
D Feature disabled/unsupported in CPU
|
||||
np "nopat" boot option specified
|
||||
!P CONFIG_X86_PAT option unset
|
||||
!M CONFIG_MTRR option unset
|
||||
Enabled PAT state set to enabled
|
||||
Disabled PAT state set to disabled
|
||||
OS PAT initializes PAT MSR with OS setting
|
||||
BIOS PAT keeps PAT MSR with BIOS setting
|
||||
|
||||
@@ -1,3 +1,9 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
======================
|
||||
Memory Protection Keys
|
||||
======================
|
||||
|
||||
Memory Protection Keys for Userspace (PKU aka PKEYs) is a feature
|
||||
which is found on Intel's Skylake "Scalable Processor" Server CPUs.
|
||||
It will be avalable in future non-server parts.
|
||||
@@ -23,9 +29,10 @@ even though there is theoretically space in the PAE PTEs. These
|
||||
permissions are enforced on data access only and have no effect on
|
||||
instruction fetches.
|
||||
|
||||
=========================== Syscalls ===========================
|
||||
Syscalls
|
||||
========
|
||||
|
||||
There are 3 system calls which directly interact with pkeys:
|
||||
There are 3 system calls which directly interact with pkeys::
|
||||
|
||||
int pkey_alloc(unsigned long flags, unsigned long init_access_rights)
|
||||
int pkey_free(int pkey);
|
||||
@@ -37,6 +44,7 @@ pkey_alloc(). An application calls the WRPKRU instruction
|
||||
directly in order to change access permissions to memory covered
|
||||
with a key. In this example WRPKRU is wrapped by a C function
|
||||
called pkey_set().
|
||||
::
|
||||
|
||||
int real_prot = PROT_READ|PROT_WRITE;
|
||||
pkey = pkey_alloc(0, PKEY_DISABLE_WRITE);
|
||||
@@ -45,43 +53,44 @@ called pkey_set().
|
||||
... application runs here
|
||||
|
||||
Now, if the application needs to update the data at 'ptr', it can
|
||||
gain access, do the update, then remove its write access:
|
||||
gain access, do the update, then remove its write access::
|
||||
|
||||
pkey_set(pkey, 0); // clear PKEY_DISABLE_WRITE
|
||||
*ptr = foo; // assign something
|
||||
pkey_set(pkey, PKEY_DISABLE_WRITE); // set PKEY_DISABLE_WRITE again
|
||||
|
||||
Now when it frees the memory, it will also free the pkey since it
|
||||
is no longer in use:
|
||||
is no longer in use::
|
||||
|
||||
munmap(ptr, PAGE_SIZE);
|
||||
pkey_free(pkey);
|
||||
|
||||
(Note: pkey_set() is a wrapper for the RDPKRU and WRPKRU instructions.
|
||||
An example implementation can be found in
|
||||
tools/testing/selftests/x86/protection_keys.c)
|
||||
.. note:: pkey_set() is a wrapper for the RDPKRU and WRPKRU instructions.
|
||||
An example implementation can be found in
|
||||
tools/testing/selftests/x86/protection_keys.c.
|
||||
|
||||
=========================== Behavior ===========================
|
||||
Behavior
|
||||
========
|
||||
|
||||
The kernel attempts to make protection keys consistent with the
|
||||
behavior of a plain mprotect(). For instance if you do this:
|
||||
behavior of a plain mprotect(). For instance if you do this::
|
||||
|
||||
mprotect(ptr, size, PROT_NONE);
|
||||
something(ptr);
|
||||
|
||||
you can expect the same effects with protection keys when doing this:
|
||||
you can expect the same effects with protection keys when doing this::
|
||||
|
||||
pkey = pkey_alloc(0, PKEY_DISABLE_WRITE | PKEY_DISABLE_READ);
|
||||
pkey_mprotect(ptr, size, PROT_READ|PROT_WRITE, pkey);
|
||||
something(ptr);
|
||||
|
||||
That should be true whether something() is a direct access to 'ptr'
|
||||
like:
|
||||
like::
|
||||
|
||||
*ptr = foo;
|
||||
|
||||
or when the kernel does the access on the application's behalf like
|
||||
with a read():
|
||||
with a read()::
|
||||
|
||||
read(fd, ptr, 1);
|
||||
|
||||
Some files were not shown because too many files have changed in this diff Show More
Reference in New Issue
Block a user