Merge tag 'kvm-4.16-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Radim Krčmář:
"ARM:
- icache invalidation optimizations, improving VM startup time
- support for forwarded level-triggered interrupts, improving
performance for timers and passthrough platform devices
- a small fix for power-management notifiers, and some cosmetic
changes
PPC:
- add MMIO emulation for vector loads and stores
- allow HPT guests to run on a radix host on POWER9 v2.2 CPUs without
requiring the complex thread synchronization of older CPU versions
- improve the handling of escalation interrupts with the XIVE
interrupt controller
- support decrement register migration
- various cleanups and bugfixes.
s390:
- Cornelia Huck passed maintainership to Janosch Frank
- exitless interrupts for emulated devices
- cleanup of cpuflag handling
- kvm_stat counter improvements
- VSIE improvements
- mm cleanup
x86:
- hypervisor part of SEV
- UMIP, RDPID, and MSR_SMI_COUNT emulation
- paravirtualized TLB shootdown using the new KVM_VCPU_PREEMPTED bit
- allow guests to see TOPOEXT, GFNI, VAES, VPCLMULQDQ, and more
AVX512 features
- show vcpu id in its anonymous inode name
- many fixes and cleanups
- per-VCPU MSR bitmaps (already merged through x86/pti branch)
- stable KVM clock when nesting on Hyper-V (merged through
x86/hyperv)"
* tag 'kvm-4.16-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (197 commits)
KVM: PPC: Book3S: Add MMIO emulation for VMX instructions
KVM: PPC: Book3S HV: Branch inside feature section
KVM: PPC: Book3S HV: Make HPT resizing work on POWER9
KVM: PPC: Book3S HV: Fix handling of secondary HPTEG in HPT resizing code
KVM: PPC: Book3S PR: Fix broken select due to misspelling
KVM: x86: don't forget vcpu_put() in kvm_arch_vcpu_ioctl_set_sregs()
KVM: PPC: Book3S PR: Fix svcpu copying with preemption enabled
KVM: PPC: Book3S HV: Drop locks before reading guest memory
kvm: x86: remove efer_reload entry in kvm_vcpu_stat
KVM: x86: AMD Processor Topology Information
x86/kvm/vmx: do not use vm-exit instruction length for fast MMIO when running nested
kvm: embed vcpu id to dentry of vcpu anon inode
kvm: Map PFN-type memory regions as writable (if possible)
x86/kvm: Make it compile on 32bit and with HYPYERVISOR_GUEST=n
KVM: arm/arm64: Fixup userspace irqchip static key optimization
KVM: arm/arm64: Fix userspace_irqchip_in_use counting
KVM: arm/arm64: Fix incorrect timer_is_pending logic
MAINTAINERS: update KVM/s390 maintainers
MAINTAINERS: add Halil as additional vfio-ccw maintainer
MAINTAINERS: add David as a reviewer for KVM/s390
...
@@ -26,3 +26,6 @@ s390-diag.txt
|
|||||||
- Diagnose hypercall description (for IBM S/390)
|
- Diagnose hypercall description (for IBM S/390)
|
||||||
timekeeping.txt
|
timekeeping.txt
|
||||||
- timekeeping virtualization for x86-based architectures.
|
- timekeeping virtualization for x86-based architectures.
|
||||||
|
amd-memory-encryption.txt
|
||||||
|
- notes on AMD Secure Encrypted Virtualization feature and SEV firmware
|
||||||
|
command description
|
||||||
|
|||||||
@@ -0,0 +1,247 @@
|
|||||||
|
======================================
|
||||||
|
Secure Encrypted Virtualization (SEV)
|
||||||
|
======================================
|
||||||
|
|
||||||
|
Overview
|
||||||
|
========
|
||||||
|
|
||||||
|
Secure Encrypted Virtualization (SEV) is a feature found on AMD processors.
|
||||||
|
|
||||||
|
SEV is an extension to the AMD-V architecture which supports running
|
||||||
|
virtual machines (VMs) under the control of a hypervisor. When enabled,
|
||||||
|
the memory contents of a VM will be transparently encrypted with a key
|
||||||
|
unique to that VM.
|
||||||
|
|
||||||
|
The hypervisor can determine the SEV support through the CPUID
|
||||||
|
instruction. The CPUID function 0x8000001f reports information related
|
||||||
|
to SEV::
|
||||||
|
|
||||||
|
0x8000001f[eax]:
|
||||||
|
Bit[1] indicates support for SEV
|
||||||
|
...
|
||||||
|
[ecx]:
|
||||||
|
Bits[31:0] Number of encrypted guests supported simultaneously
|
||||||
|
|
||||||
|
If support for SEV is present, MSR 0xc001_0010 (MSR_K8_SYSCFG) and MSR 0xc001_0015
|
||||||
|
(MSR_K7_HWCR) can be used to determine if it can be enabled::
|
||||||
|
|
||||||
|
0xc001_0010:
|
||||||
|
Bit[23] 1 = memory encryption can be enabled
|
||||||
|
0 = memory encryption can not be enabled
|
||||||
|
|
||||||
|
0xc001_0015:
|
||||||
|
Bit[0] 1 = memory encryption can be enabled
|
||||||
|
0 = memory encryption can not be enabled
|
||||||
|
|
||||||
|
When SEV support is available, it can be enabled in a specific VM by
|
||||||
|
setting the SEV bit before executing VMRUN.::
|
||||||
|
|
||||||
|
VMCB[0x90]:
|
||||||
|
Bit[1] 1 = SEV is enabled
|
||||||
|
0 = SEV is disabled
|
||||||
|
|
||||||
|
SEV hardware uses ASIDs to associate a memory encryption key with a VM.
|
||||||
|
Hence, the ASID for the SEV-enabled guests must be from 1 to a maximum value
|
||||||
|
defined in the CPUID 0x8000001f[ecx] field.
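
The checks described in this overview can be sketched as a few bit tests. This is a minimal illustration, not kernel code: the helper and macro names are hypothetical, the register values are assumed to have been read elsewhere (CPUID and RDMSR are not executed here), and it assumes both MSR bits must be set before SEV can be enabled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions taken from the overview text; the names are illustrative. */
#define SEV_CPUID_EAX_SEV   (1u << 1)     /* CPUID 0x8000001f[eax] bit 1 */
#define SEV_SYSCFG_MEM_ENC  (1ull << 23)  /* MSR 0xc001_0010 bit 23 */
#define SEV_HWCR_MEM_ENC    (1ull << 0)   /* MSR 0xc001_0015 bit 0 */

/* True if the eax value of CPUID 0x8000001f advertises SEV. */
static inline bool sev_cpuid_supported(uint32_t eax)
{
	return (eax & SEV_CPUID_EAX_SEV) != 0;
}

/* True if both MSR values report that memory encryption can be enabled
 * (assumption: both bits are required). */
static inline bool sev_can_be_enabled(uint64_t syscfg, uint64_t hwcr)
{
	return (syscfg & SEV_SYSCFG_MEM_ENC) != 0 &&
	       (hwcr & SEV_HWCR_MEM_ENC) != 0;
}

/* True if asid lies in 1..ecx, where ecx is CPUID 0x8000001f[ecx]. */
static inline bool sev_asid_valid(uint32_t asid, uint32_t ecx)
{
	return asid >= 1 && asid <= ecx;
}
```

A caller would read the CPUID leaf and the two MSRs through whatever mechanism its environment provides and pass the raw values in.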
|
||||||
|
|
||||||
|
SEV Key Management
|
||||||
|
==================
|
||||||
|
|
||||||
|
The SEV guest key management is handled by a separate processor called the AMD
|
||||||
|
Secure Processor (AMD-SP). Firmware running inside the AMD-SP provides a secure
|
||||||
|
key management interface to perform common hypervisor activities such as
|
||||||
|
encrypting bootstrap code, snapshot, migrating and debugging the guest. For more
|
||||||
|
information, see the SEV Key Management spec [api-spec]_
|
||||||
|
|
||||||
|
KVM implements the following commands to support common lifecycle events of SEV
|
||||||
|
guests, such as launching, running, snapshotting, migrating and decommissioning.
|
||||||
|
|
||||||
|
1. KVM_SEV_INIT
|
||||||
|
---------------
|
||||||
|
|
||||||
|
The KVM_SEV_INIT command is used by the hypervisor to initialize the SEV platform
|
||||||
|
context. In a typical workflow, this command should be the first command issued.
|
||||||
|
|
||||||
|
Returns: 0 on success, -negative on error
|
||||||
|
|
||||||
|
2. KVM_SEV_LAUNCH_START
|
||||||
|
-----------------------
|
||||||
|
|
||||||
|
The KVM_SEV_LAUNCH_START command is used for creating the memory encryption
|
||||||
|
context. To create the encryption context, user must provide a guest policy,
|
||||||
|
the owner's public Diffie-Hellman (PDH) key and session information.
|
||||||
|
|
||||||
|
Parameters: struct kvm_sev_launch_start (in/out)
|
||||||
|
|
||||||
|
Returns: 0 on success, -negative on error
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
struct kvm_sev_launch_start {
|
||||||
|
__u32 handle; /* if zero then firmware creates a new handle */
|
||||||
|
__u32 policy; /* guest's policy */
|
||||||
|
|
||||||
|
__u64 dh_uaddr; /* userspace address pointing to the guest owner's PDH key */
|
||||||
|
__u32 dh_len;
|
||||||
|
|
||||||
|
__u64 session_addr; /* userspace address which points to the guest session information */
|
||||||
|
__u32 session_len;
|
||||||
|
};
|
||||||
|
|
||||||
|
On success, the 'handle' field contains the new handle. On error, a negative value is returned.
|
||||||
|
|
||||||
|
For more details, see SEV spec Section 6.2.
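
As a rough sketch of how userspace might prepare the parameters, the following fills a local mirror of the structure shown above. The mirror type and the helper name are hypothetical; they are used so the example builds without kernel headers, and issuing the actual command is left out.

```c
#include <stdint.h>
#include <string.h>

/* Local mirror of struct kvm_sev_launch_start as listed above
 * (illustrative only; real code would use the kernel's definition). */
struct sev_launch_start_sketch {
	uint32_t handle;       /* zero: firmware creates a new handle */
	uint32_t policy;       /* guest's policy */
	uint64_t dh_uaddr;     /* userspace address of the guest owner's PDH key */
	uint32_t dh_len;
	uint64_t session_addr; /* userspace address of the session information */
	uint32_t session_len;
};

/* Hypothetical helper: populate the parameters for a fresh launch. */
static inline void sev_fill_launch_start(struct sev_launch_start_sketch *p,
					 uint32_t policy,
					 const void *pdh, uint32_t pdh_len,
					 const void *session, uint32_t session_len)
{
	memset(p, 0, sizeof(*p)); /* leaves handle at 0 */
	p->policy = policy;
	p->dh_uaddr = (uint64_t)(uintptr_t)pdh;
	p->dh_len = pdh_len;
	p->session_addr = (uint64_t)(uintptr_t)session;
	p->session_len = session_len;
}
```

Zeroing the structure first keeps 'handle' at zero, which per the comment in the structure asks the firmware to create a new handle.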
|
||||||
|
|
||||||
|
3. KVM_SEV_LAUNCH_UPDATE_DATA
|
||||||
|
-----------------------------
|
||||||
|
|
||||||
|
The KVM_SEV_LAUNCH_UPDATE_DATA is used for encrypting a memory region. It also
|
||||||
|
calculates a measurement of the memory contents. The measurement is a signature
|
||||||
|
of the memory contents that can be sent to the guest owner as an attestation
|
||||||
|
that the memory was encrypted correctly by the firmware.
|
||||||
|
|
||||||
|
Parameters (in): struct kvm_sev_launch_update_data
|
||||||
|
|
||||||
|
Returns: 0 on success, -negative on error
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
struct kvm_sev_launch_update_data {
|
||||||
|
__u64 uaddr; /* userspace address to be encrypted (must be 16-byte aligned) */
|
||||||
|
__u32 len; /* length of the data to be encrypted (must be 16-byte aligned) */
|
||||||
|
};
|
||||||
|
|
||||||
|
For more details, see SEV spec Section 6.3.
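
The 16-byte alignment requirement on both fields can be checked before the command is issued. A minimal sketch, with a hypothetical helper name and constant:

```c
#include <stdbool.h>
#include <stdint.h>

/* Alignment stated in the structure comments above. */
#define SEV_LAUNCH_UPDATE_ALIGN 16u

/* Hypothetical pre-check: true if both the address and the length are
 * multiples of 16 bytes. */
static inline bool sev_launch_update_args_aligned(uint64_t uaddr, uint32_t len)
{
	return (uaddr % SEV_LAUNCH_UPDATE_ALIGN) == 0 &&
	       (len % SEV_LAUNCH_UPDATE_ALIGN) == 0;
}
```

Whether a caller rounds a misaligned region or rejects it is a design choice this sketch leaves open.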
|
||||||
|
|
||||||
|
4. KVM_SEV_LAUNCH_MEASURE
|
||||||
|
-------------------------
|
||||||
|
|
||||||
|
The KVM_SEV_LAUNCH_MEASURE command is used to retrieve the measurement of the
|
||||||
|
data encrypted by the KVM_SEV_LAUNCH_UPDATE_DATA command. The guest owner may
|
||||||
|
wait to provide the guest with confidential information until it can verify the
|
||||||
|
measurement. Since the guest owner knows the initial contents of the guest at
|
||||||
|
boot, the measurement can be verified by comparing it to what the guest owner
|
||||||
|
expects.
|
||||||
|
|
||||||
|
Parameters (in): struct kvm_sev_launch_measure
|
||||||
|
|
||||||
|
Returns: 0 on success, -negative on error
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
struct kvm_sev_launch_measure {
|
||||||
|
__u64 uaddr; /* where to copy the measurement */
|
||||||
|
__u32 len; /* length of measurement blob */
|
||||||
|
};
|
||||||
|
|
||||||
|
For more details on the measurement verification flow, see SEV spec Section 6.4.
|
||||||
|
|
||||||
|
5. KVM_SEV_LAUNCH_FINISH
|
||||||
|
------------------------
|
||||||
|
|
||||||
|
After completion of the launch flow, the KVM_SEV_LAUNCH_FINISH command can be
|
||||||
|
issued to make the guest ready for the execution.
|
||||||
|
|
||||||
|
Returns: 0 on success, -negative on error
|
||||||
|
|
||||||
|
6. KVM_SEV_GUEST_STATUS
|
||||||
|
-----------------------
|
||||||
|
|
||||||
|
The KVM_SEV_GUEST_STATUS command is used to retrieve status information about a
|
||||||
|
SEV-enabled guest.
|
||||||
|
|
||||||
|
Parameters (out): struct kvm_sev_guest_status
|
||||||
|
|
||||||
|
Returns: 0 on success, -negative on error
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
struct kvm_sev_guest_status {
|
||||||
|
__u32 handle; /* guest handle */
|
||||||
|
__u32 policy; /* guest policy */
|
||||||
|
__u8 state; /* guest state (see enum below) */
|
||||||
|
};
|
||||||
|
|
||||||
|
SEV guest state:
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
enum {
|
||||||
|
SEV_STATE_INVALID = 0,
|
||||||
|
SEV_STATE_LAUNCHING, /* guest is currently being launched */
|
||||||
|
SEV_STATE_SECRET, /* guest is being launched and ready to accept the ciphertext data */
|
||||||
|
SEV_STATE_RUNNING, /* guest is fully launched and running */
|
||||||
|
SEV_STATE_RECEIVING, /* guest is being migrated in from another SEV machine */
|
||||||
|
SEV_STATE_SENDING /* guest is getting migrated out to another SEV machine */
|
||||||
|
};
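
For logging or debugging, the 'state' value returned by KVM_SEV_GUEST_STATUS could be mapped to a readable name. The sketch below restates the enum in compilable form under a hypothetical type name; the helper and its strings are illustrative and not part of the KVM interface.

```c
#include <stddef.h>

/* Same enumerators and order as the list above. */
enum sev_state_sketch {
	SEV_STATE_INVALID = 0,
	SEV_STATE_LAUNCHING,
	SEV_STATE_SECRET,
	SEV_STATE_RUNNING,
	SEV_STATE_RECEIVING,
	SEV_STATE_SENDING
};

/* Hypothetical helper: name for a state value, "unknown" if out of range. */
static inline const char *sev_state_name(unsigned int state)
{
	static const char *const names[] = {
		"invalid", "launching", "secret",
		"running", "receiving", "sending",
	};

	if (state >= sizeof(names) / sizeof(names[0]))
		return "unknown";
	return names[state];
}
```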
|
||||||
|
|
||||||
|
7. KVM_SEV_DBG_DECRYPT
|
||||||
|
----------------------
|
||||||
|
|
||||||
|
The KVM_SEV_DBG_DECRYPT command can be used by the hypervisor to request the
|
||||||
|
firmware to decrypt the data at the given memory region.
|
||||||
|
|
||||||
|
Parameters (in): struct kvm_sev_dbg
|
||||||
|
|
||||||
|
Returns: 0 on success, -negative on error
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
struct kvm_sev_dbg {
|
||||||
|
__u64 src_uaddr; /* userspace address of data to decrypt */
|
||||||
|
__u64 dst_uaddr; /* userspace address of destination */
|
||||||
|
__u32 len; /* length of memory region to decrypt */
|
||||||
|
};
|
||||||
|
|
||||||
|
The command returns an error if the guest policy does not allow debugging.
|
||||||
|
|
||||||
|
8. KVM_SEV_DBG_ENCRYPT
|
||||||
|
----------------------
|
||||||
|
|
||||||
|
The KVM_SEV_DBG_ENCRYPT command can be used by the hypervisor to request the
|
||||||
|
firmware to encrypt the data at the given memory region.
|
||||||
|
|
||||||
|
Parameters (in): struct kvm_sev_dbg
|
||||||
|
|
||||||
|
Returns: 0 on success, -negative on error
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
struct kvm_sev_dbg {
|
||||||
|
__u64 src_uaddr; /* userspace address of data to encrypt */
|
||||||
|
__u64 dst_uaddr; /* userspace address of destination */
|
||||||
|
__u32 len; /* length of memory region to encrypt */
|
||||||
|
};
|
||||||
|
|
||||||
|
The command returns an error if the guest policy does not allow debugging.
|
||||||
|
|
||||||
|
9. KVM_SEV_LAUNCH_SECRET
|
||||||
|
------------------------
|
||||||
|
|
||||||
|
The KVM_SEV_LAUNCH_SECRET command can be used by the hypervisor to inject secret
|
||||||
|
data after the measurement has been validated by the guest owner.
|
||||||
|
|
||||||
|
Parameters (in): struct kvm_sev_launch_secret
|
||||||
|
|
||||||
|
Returns: 0 on success, -negative on error
|
||||||
|
|
||||||
|
::
|
||||||
|
|
||||||
|
struct kvm_sev_launch_secret {
|
||||||
|
__u64 hdr_uaddr; /* userspace address containing the packet header */
|
||||||
|
__u32 hdr_len;
|
||||||
|
|
||||||
|
__u64 guest_uaddr; /* the guest memory region where the secret should be injected */
|
||||||
|
__u32 guest_len;
|
||||||
|
|
||||||
|
__u64 trans_uaddr; /* the hypervisor memory region which contains the secret */
|
||||||
|
__u32 trans_len;
|
||||||
|
};
|
||||||
|
|
||||||
|
References
|
||||||
|
==========
|
||||||
|
|
||||||
|
.. [white-paper] http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2013/12/AMD_Memory_Encryption_Whitepaper_v7-Public.pdf
|
||||||
|
.. [api-spec] http://support.amd.com/TechDocs/55766_SEV-KM%20API_Specification.pdf
|
||||||
|
.. [amd-apm] http://support.amd.com/TechDocs/24593.pdf (section 15.34)
|
||||||
|
.. [kvm-forum] http://www.linux-kvm.org/images/7/74/02x08A-Thomas_Lendacky-AMDs_Virtualizatoin_Memory_Encryption_Technology.pdf
|
||||||
@@ -1841,6 +1841,7 @@ registers, find a list below:
|
|||||||
PPC | KVM_REG_PPC_DBSR | 32
|
PPC | KVM_REG_PPC_DBSR | 32
|
||||||
PPC | KVM_REG_PPC_TIDR | 64
|
PPC | KVM_REG_PPC_TIDR | 64
|
||||||
PPC | KVM_REG_PPC_PSSCR | 64
|
PPC | KVM_REG_PPC_PSSCR | 64
|
||||||
|
PPC | KVM_REG_PPC_DEC_EXPIRY | 64
|
||||||
PPC | KVM_REG_PPC_TM_GPR0 | 64
|
PPC | KVM_REG_PPC_TM_GPR0 | 64
|
||||||
...
|
...
|
||||||
PPC | KVM_REG_PPC_TM_GPR31 | 64
|
PPC | KVM_REG_PPC_TM_GPR31 | 64
|
||||||
@@ -3403,7 +3404,7 @@ invalid, if invalid pages are written to (e.g. after the end of memory)
|
|||||||
or if no page table is present for the addresses (e.g. when using
|
or if no page table is present for the addresses (e.g. when using
|
||||||
hugepages).
|
hugepages).
|
||||||
|
|
||||||
4.108 KVM_PPC_GET_CPU_CHAR
|
4.109 KVM_PPC_GET_CPU_CHAR
|
||||||
|
|
||||||
Capability: KVM_CAP_PPC_GET_CPU_CHAR
|
Capability: KVM_CAP_PPC_GET_CPU_CHAR
|
||||||
Architectures: powerpc
|
Architectures: powerpc
|
||||||
@@ -3449,6 +3450,57 @@ array bounds check and the array access.
|
|||||||
These fields use the same bit definitions as the new
|
These fields use the same bit definitions as the new
|
||||||
H_GET_CPU_CHARACTERISTICS hypercall.
|
H_GET_CPU_CHARACTERISTICS hypercall.
|
||||||
|
|
||||||
|
4.110 KVM_MEMORY_ENCRYPT_OP
|
||||||
|
|
||||||
|
Capability: basic
|
||||||
|
Architectures: x86
|
||||||
|
Type: system
|
||||||
|
Parameters: an opaque platform specific structure (in/out)
|
||||||
|
Returns: 0 on success; -1 on error
|
||||||
|
|
||||||
|
If the platform supports creating encrypted VMs then this ioctl can be used
|
||||||
|
for issuing platform-specific memory encryption commands to manage those
|
||||||
|
encrypted VMs.
|
||||||
|
|
||||||
|
Currently, this ioctl is used for issuing Secure Encrypted Virtualization
|
||||||
|
(SEV) commands on AMD Processors. The SEV commands are defined in
|
||||||
|
Documentation/virtual/kvm/amd-memory-encryption.txt.
|
||||||
|
|
||||||
|
4.111 KVM_MEMORY_ENCRYPT_REG_REGION
|
||||||
|
|
||||||
|
Capability: basic
|
||||||
|
Architectures: x86
|
||||||
|
Type: system
|
||||||
|
Parameters: struct kvm_enc_region (in)
|
||||||
|
Returns: 0 on success; -1 on error
|
||||||
|
|
||||||
|
This ioctl can be used to register a guest memory region which may
|
||||||
|
contain encrypted data (e.g. guest RAM, SMRAM etc).
|
||||||
|
|
||||||
|
It is used in the SEV-enabled guest. When encryption is enabled, a guest
|
||||||
|
memory region may contain encrypted data. The SEV memory encryption
|
||||||
|
engine uses a tweak such that two identical plaintext pages, each at
|
||||||
|
different locations will have differing ciphertexts. So swapping or
|
||||||
|
moving ciphertext of those pages will not result in plaintext being
|
||||||
|
swapped. So relocating (or migrating) physical backing pages for the SEV
|
||||||
|
guest will require some additional steps.
|
||||||
|
|
||||||
|
Note: The current SEV key management spec does not provide commands to
|
||||||
|
swap or migrate (move) ciphertext pages. Hence, for now we pin the guest
|
||||||
|
memory region registered with the ioctl.
|
||||||
|
|
||||||
|
4.112 KVM_MEMORY_ENCRYPT_UNREG_REGION
|
||||||
|
|
||||||
|
Capability: basic
|
||||||
|
Architectures: x86
|
||||||
|
Type: system
|
||||||
|
Parameters: struct kvm_enc_region (in)
|
||||||
|
Returns: 0 on success; -1 on error
|
||||||
|
|
||||||
|
This ioctl can be used to unregister the guest memory region registered
|
||||||
|
with KVM_MEMORY_ENCRYPT_REG_REGION ioctl above.
|
||||||
|
|
||||||
|
|
||||||
5. The kvm_run structure
|
5. The kvm_run structure
|
||||||
------------------------
|
------------------------
|
||||||
|
|
||||||
|
|||||||
@@ -1,187 +0,0 @@
|
|||||||
KVM/ARM VGIC Forwarded Physical Interrupts
|
|
||||||
==========================================
|
|
||||||
|
|
||||||
The KVM/ARM code implements software support for the ARM Generic
|
|
||||||
Interrupt Controller's (GIC's) hardware support for virtualization by
|
|
||||||
allowing software to inject virtual interrupts to a VM, which the guest
|
|
||||||
OS sees as regular interrupts. The code is famously known as the VGIC.
|
|
||||||
|
|
||||||
Some of these virtual interrupts, however, correspond to physical
|
|
||||||
interrupts from real physical devices. One example could be the
|
|
||||||
architected timer, which itself supports virtualization, and therefore
|
|
||||||
lets a guest OS program the hardware device directly to raise an
|
|
||||||
interrupt at some point in time. When such an interrupt is raised, the
|
|
||||||
host OS initially handles the interrupt and must somehow signal this
|
|
||||||
event as a virtual interrupt to the guest. Another example could be a
|
|
||||||
passthrough device, where the physical interrupts are initially handled
|
|
||||||
by the host, but the device driver for the device lives in the guest OS
|
|
||||||
and KVM must therefore somehow inject a virtual interrupt on behalf of
|
|
||||||
the physical one to the guest OS.
|
|
||||||
|
|
||||||
These virtual interrupts corresponding to a physical interrupt on the
|
|
||||||
host are called forwarded physical interrupts, but are also sometimes
|
|
||||||
referred to as 'virtualized physical interrupts' and 'mapped interrupts'.
|
|
||||||
|
|
||||||
Forwarded physical interrupts are handled slightly differently compared
|
|
||||||
to virtual interrupts generated purely by a software emulated device.
|
|
||||||
|
|
||||||
|
|
||||||
The HW bit
|
|
||||||
----------
|
|
||||||
Virtual interrupts are signalled to the guest by programming the List
|
|
||||||
Registers (LRs) on the GIC before running a VCPU. The LR is programmed
|
|
||||||
with the virtual IRQ number and the state of the interrupt (Pending,
|
|
||||||
Active, or Pending+Active). When the guest ACKs and EOIs a virtual
|
|
||||||
interrupt, the LR state moves from Pending to Active, and finally to
|
|
||||||
inactive.
|
|
||||||
|
|
||||||
The LRs include an extra bit, called the HW bit. When this bit is set,
|
|
||||||
KVM must also program an additional field in the LR, the physical IRQ
|
|
||||||
number, to link the virtual with the physical IRQ.
|
|
||||||
|
|
||||||
When the HW bit is set, KVM must EITHER set the Pending OR the Active
|
|
||||||
bit, never both at the same time.
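
The constraint can be expressed as a small validity check. This is a sketch over a hypothetical, simplified view of a list register; it does not reflect the real LR bit layout or any VGIC data structure.

```c
#include <stdbool.h>

/* Simplified, illustrative view of the LR fields discussed above. */
struct lr_state_sketch {
	bool hw;      /* HW bit: virtual IRQ is linked to a physical IRQ */
	bool pending; /* Pending state bit */
	bool active;  /* Active state bit */
};

/* True unless the HW bit is combined with Pending+Active. */
static inline bool lr_state_valid(const struct lr_state_sketch *lr)
{
	return !(lr->hw && lr->pending && lr->active);
}
```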
|
|
||||||
|
|
||||||
Setting the HW bit causes the hardware to deactivate the physical
|
|
||||||
interrupt on the physical distributor when the guest deactivates the
|
|
||||||
corresponding virtual interrupt.
|
|
||||||
|
|
||||||
|
|
||||||
Forwarded Physical Interrupts Life Cycle
|
|
||||||
----------------------------------------
|
|
||||||
|
|
||||||
The state of forwarded physical interrupts is managed in the following way:
|
|
||||||
|
|
||||||
- The physical interrupt is acked by the host, and becomes active on
|
|
||||||
the physical distributor (*).
|
|
||||||
- KVM sets the LR.Pending bit, because this is the only way the GICV
|
|
||||||
interface is going to present it to the guest.
|
|
||||||
- LR.Pending will stay set as long as the guest has not acked the interrupt.
|
|
||||||
- LR.Pending transitions to LR.Active on the guest read of the IAR, as
|
|
||||||
expected.
|
|
||||||
- On guest EOI, the *physical distributor* active bit gets cleared,
|
|
||||||
but the LR.Active is left untouched (set).
|
|
||||||
- KVM clears the LR on VM exits when the physical distributor
|
|
||||||
active state has been cleared.
|
|
||||||
|
|
||||||
(*): The host handling is slightly more complicated. For some forwarded
|
|
||||||
interrupts (shared), KVM directly sets the active state on the physical
|
|
||||||
distributor before entering the guest, because the interrupt is never actually
|
|
||||||
handled on the host (see details on the timer as an example below). For other
|
|
||||||
forwarded interrupts (non-shared) the host does not deactivate the interrupt
|
|
||||||
when the host ISR completes, but leaves the interrupt active until the guest
|
|
||||||
deactivates it. Leaving the interrupt active is allowed, because Linux
|
|
||||||
configures the physical GIC with EOIMode=1, which causes EOI operations to
|
|
||||||
perform a priority drop allowing the GIC to receive other interrupts of the
|
|
||||||
default priority.
|
|
||||||
|
|
||||||
|
|
||||||
Forwarded Edge and Level Triggered PPIs and SPIs
|
|
||||||
------------------------------------------------
|
|
||||||
Forwarded physical interrupts injected should always be active on the
|
|
||||||
physical distributor when injected to a guest.
|
|
||||||
|
|
||||||
Level-triggered interrupts will keep the interrupt line to the GIC
|
|
||||||
asserted, typically until the guest programs the device to deassert the
|
|
||||||
line. This means that the interrupt will remain pending on the physical
|
|
||||||
distributor until the guest has reprogrammed the device. Since we
|
|
||||||
always run the VM with interrupts enabled on the CPU, a pending
|
|
||||||
interrupt will exit the guest as soon as we switch into the guest,
|
|
||||||
preventing the guest from ever making progress as the process repeats
|
|
||||||
over and over. Therefore, the active state on the physical distributor
|
|
||||||
must be set when entering the guest, preventing the GIC from forwarding
|
|
||||||
the pending interrupt to the CPU. As soon as the guest deactivates the
|
|
||||||
interrupt, the physical line is sampled by the hardware again and the host
|
|
||||||
takes a new interrupt if and only if the physical line is still asserted.
|
|
||||||
|
|
||||||
Edge-triggered interrupts do not exhibit the same problem with
|
|
||||||
preventing guest execution that level-triggered interrupts do. One
|
|
||||||
option is to not use HW bit at all, and inject edge-triggered interrupts
|
|
||||||
from a physical device as pure virtual interrupts. But that would
|
|
||||||
potentially slow down handling of the interrupt in the guest, because a
|
|
||||||
physical interrupt occurring in the middle of the guest ISR would
|
|
||||||
preempt the guest for the host to handle the interrupt. Additionally,
|
|
||||||
if you configure the system to handle interrupts on a separate physical
|
|
||||||
core from that running your VCPU, you still have to interrupt the VCPU
|
|
||||||
to queue the pending state onto the LR, even though the guest won't use
|
|
||||||
this information until the guest ISR completes. Therefore, the HW
|
|
||||||
bit should always be set for forwarded edge-triggered interrupts. With
|
|
||||||
the HW bit set, the virtual interrupt is injected and additional
|
|
||||||
physical interrupts occurring before the guest deactivates the interrupt
|
|
||||||
simply mark the state on the physical distributor as Pending+Active. As
|
|
||||||
soon as the guest deactivates the interrupt, the host takes another
|
|
||||||
interrupt if and only if there was a physical interrupt between injecting
|
|
||||||
the forwarded interrupt to the guest and the guest deactivating the
|
|
||||||
interrupt.
|
|
||||||
|
|
||||||
Consequently, whenever we schedule a VCPU with one or more LRs with the
|
|
||||||
HW bit set, the interrupt must also be active on the physical
|
|
||||||
distributor.
|
|
||||||
|
|
||||||
|
|
||||||
Forwarded LPIs
|
|
||||||
--------------
|
|
||||||
LPIs, introduced in GICv3, are always edge-triggered and do not have an
|
|
||||||
active state. They become pending when a device signals them, and as
|
|
||||||
soon as they are acked by the CPU, they are inactive again.
|
|
||||||
|
|
||||||
It therefore doesn't make sense, and is not supported, to set the HW bit
|
|
||||||
for physical LPIs that are forwarded to a VM as virtual interrupts,
|
|
||||||
typically virtual SPIs.
|
|
||||||
|
|
||||||
For LPIs, there is no other choice than to preempt the VCPU thread if
|
|
||||||
necessary, and queue the pending state onto the LR.
|
|
||||||
|
|
||||||
|
|
||||||
Putting It Together: The Architected Timer
|
|
||||||
------------------------------------------
|
|
||||||
The architected timer is a device that signals interrupts with level
|
|
||||||
triggered semantics. The timer hardware is directly accessed by VCPUs
|
|
||||||
which program the timer to fire at some point in time. Each VCPU on a
|
|
||||||
system programs the timer to fire at different times, and therefore the
|
|
||||||
hardware is multiplexed between multiple VCPUs. This is implemented by
|
|
||||||
context-switching the timer state along with each VCPU thread.
|
|
||||||
|
|
||||||
However, this means that a scenario like the following is entirely
|
|
||||||
possible, and in fact, typical:
|
|
||||||
|
|
||||||
1. KVM runs the VCPU
|
|
||||||
2. The guest programs the timer to fire in T+100
|
|
||||||
3. The guest is idle and calls WFI (wait-for-interrupts)
|
|
||||||
4. The hardware traps to the host
|
|
||||||
5. KVM stores the timer state to memory and disables the hardware timer
|
|
||||||
6. KVM schedules a soft timer to fire in T+(100 - time since step 2)
|
|
||||||
7. KVM puts the VCPU thread to sleep (on a waitqueue)
|
|
||||||
8. The soft timer fires, waking up the VCPU thread
|
|
||||||
9. KVM reprograms the timer hardware with the VCPU's values
|
|
||||||
10. KVM marks the timer interrupt as active on the physical distributor
|
|
||||||
11. KVM injects a forwarded physical interrupt to the guest
|
|
||||||
12. KVM runs the VCPU
|
|
||||||
|
|
||||||
Notice that KVM injects a forwarded physical interrupt in step 11 without
|
|
||||||
the corresponding interrupt having actually fired on the host. That is
|
|
||||||
exactly why we mark the timer interrupt as active in step 10, because
|
|
||||||
the active state on the physical distributor is part of the state
|
|
||||||
belonging to the timer hardware, which is context-switched along with
|
|
||||||
the VCPU thread.
|
|
||||||
|
|
||||||
If the guest does not idle because it is busy, the flow looks like this
|
|
||||||
instead:
|
|
||||||
|
|
||||||
1. KVM runs the VCPU
|
|
||||||
2. The guest programs the timer to fire in T+100
|
|
||||||
3. At T+100 the timer fires and a physical IRQ causes the VM to exit
|
|
||||||
(note that this initially only traps to EL2 and does not run the host ISR
|
|
||||||
until KVM has returned to the host).
|
|
||||||
4. With interrupts still disabled on the CPU coming back from the guest, KVM
|
|
||||||
stores the virtual timer state to memory and disables the virtual hw timer.
|
|
||||||
5. KVM looks at the timer state (in memory) and injects a forwarded physical
|
|
||||||
interrupt because it concludes the timer has expired.
|
|
||||||
6. KVM marks the timer interrupt as active on the physical distributor
|
|
||||||
7. KVM enables the timer, enables interrupts, and runs the VCPU
|
|
||||||
|
|
||||||
Notice that again the forwarded physical interrupt is injected to the
|
|
||||||
guest without having actually been handled on the host. In this case it
|
|
||||||
is because the physical interrupt is never actually seen by the host because the
|
|
||||||
timer is disabled upon guest return, and the virtual forwarded interrupt is
|
|
||||||
injected on the KVM guest entry path.
|
|
||||||
@@ -54,6 +54,10 @@ KVM_FEATURE_PV_UNHALT || 7 || guest checks this feature bit
|
|||||||
|| || before enabling paravirtualized
|
|| || before enabling paravirtualized
|
||||||
|| || spinlock support.
|
|| || spinlock support.
|
||||||
------------------------------------------------------------------------------
|
------------------------------------------------------------------------------
|
||||||
|
KVM_FEATURE_PV_TLB_FLUSH || 9 || guest checks this feature bit
|
||||||
|
|| || before enabling paravirtualized
|
||||||
|
|| || tlb flush.
|
||||||
|
------------------------------------------------------------------------------
|
||||||
KVM_FEATURE_CLOCKSOURCE_STABLE_BIT || 24 || host will warn if no guest-side
|
KVM_FEATURE_CLOCKSOURCE_STABLE_BIT || 24 || host will warn if no guest-side
|
||||||
|| || per-cpu warps are expected in
|
|| || per-cpu warps are expected in
|
||||||
|| || kvmclock.
|
|| || kvmclock.
|
||||||
|
|||||||
+4
-1
@@ -7748,7 +7748,9 @@ F: arch/powerpc/kernel/kvm*
|
|||||||
|
|
||||||
KERNEL VIRTUAL MACHINE for s390 (KVM/s390)
|
KERNEL VIRTUAL MACHINE for s390 (KVM/s390)
|
||||||
M: Christian Borntraeger <borntraeger@de.ibm.com>
|
M: Christian Borntraeger <borntraeger@de.ibm.com>
|
||||||
M: Cornelia Huck <cohuck@redhat.com>
|
M: Janosch Frank <frankja@linux.vnet.ibm.com>
|
||||||
|
R: David Hildenbrand <david@redhat.com>
|
||||||
|
R: Cornelia Huck <cohuck@redhat.com>
|
||||||
L: linux-s390@vger.kernel.org
|
L: linux-s390@vger.kernel.org
|
||||||
W: http://www.ibm.com/developerworks/linux/linux390/
|
W: http://www.ibm.com/developerworks/linux/linux390/
|
||||||
T: git git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux.git
|
T: git git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux.git
|
||||||
@@ -12026,6 +12028,7 @@ F: drivers/pci/hotplug/s390_pci_hpc.c
|
|||||||
S390 VFIO-CCW DRIVER
|
S390 VFIO-CCW DRIVER
|
||||||
M: Cornelia Huck <cohuck@redhat.com>
|
M: Cornelia Huck <cohuck@redhat.com>
|
||||||
M: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
|
M: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
|
||||||
|
M: Halil Pasic <pasic@linux.vnet.ibm.com>
|
||||||
L: linux-s390@vger.kernel.org
|
L: linux-s390@vger.kernel.org
|
||||||
L: kvm@vger.kernel.org
|
L: kvm@vger.kernel.org
|
||||||
S: Supported
|
S: Supported
|
||||||
|
|||||||
@@ -131,7 +131,7 @@ static inline bool mode_has_spsr(struct kvm_vcpu *vcpu)
|
|||||||
static inline bool vcpu_mode_priv(struct kvm_vcpu *vcpu)
|
static inline bool vcpu_mode_priv(struct kvm_vcpu *vcpu)
|
||||||
{
|
{
|
||||||
unsigned long cpsr_mode = vcpu->arch.ctxt.gp_regs.usr_regs.ARM_cpsr & MODE_MASK;
|
unsigned long cpsr_mode = vcpu->arch.ctxt.gp_regs.usr_regs.ARM_cpsr & MODE_MASK;
|
||||||
return cpsr_mode > USR_MODE;;
|
return cpsr_mode > USR_MODE;
|
||||||
}
|
}
|
||||||
|
|
||||||
static inline u32 kvm_vcpu_get_hsr(const struct kvm_vcpu *vcpu)
|
static inline u32 kvm_vcpu_get_hsr(const struct kvm_vcpu *vcpu)
|
||||||
|
|||||||
@@ -48,6 +48,8 @@
|
|||||||
KVM_ARCH_REQ_FLAGS(0, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
|
KVM_ARCH_REQ_FLAGS(0, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
|
||||||
#define KVM_REQ_IRQ_PENDING KVM_ARCH_REQ(1)
|
#define KVM_REQ_IRQ_PENDING KVM_ARCH_REQ(1)
|
||||||
|
|
||||||
|
DECLARE_STATIC_KEY_FALSE(userspace_irqchip_in_use);
|
||||||
|
|
||||||
u32 *kvm_vcpu_reg(struct kvm_vcpu *vcpu, u8 reg_num, u32 mode);
|
u32 *kvm_vcpu_reg(struct kvm_vcpu *vcpu, u8 reg_num, u32 mode);
|
||||||
int __attribute_const__ kvm_target_cpu(void);
|
int __attribute_const__ kvm_target_cpu(void);
|
||||||
int kvm_reset_vcpu(struct kvm_vcpu *vcpu);
|
int kvm_reset_vcpu(struct kvm_vcpu *vcpu);
|
||||||
|
|||||||
@@ -21,7 +21,6 @@
|
|||||||
#include <linux/compiler.h>
|
#include <linux/compiler.h>
|
||||||
#include <linux/kvm_host.h>
|
#include <linux/kvm_host.h>
|
||||||
#include <asm/cp15.h>
|
#include <asm/cp15.h>
|
||||||
#include <asm/kvm_mmu.h>
|
|
||||||
#include <asm/vfp.h>
|
#include <asm/vfp.h>
|
||||||
|
|
||||||
#define __hyp_text __section(.hyp.text) notrace
|
#define __hyp_text __section(.hyp.text) notrace
|
||||||
@@ -69,6 +68,8 @@
|
|||||||
#define HIFAR __ACCESS_CP15(c6, 4, c0, 2)
|
#define HIFAR __ACCESS_CP15(c6, 4, c0, 2)
|
||||||
#define HPFAR __ACCESS_CP15(c6, 4, c0, 4)
|
#define HPFAR __ACCESS_CP15(c6, 4, c0, 4)
|
||||||
#define ICIALLUIS __ACCESS_CP15(c7, 0, c1, 0)
|
#define ICIALLUIS __ACCESS_CP15(c7, 0, c1, 0)
|
||||||
|
#define BPIALLIS __ACCESS_CP15(c7, 0, c1, 6)
|
||||||
|
#define ICIMVAU __ACCESS_CP15(c7, 0, c5, 1)
|
||||||
#define ATS1CPR __ACCESS_CP15(c7, 0, c8, 0)
|
#define ATS1CPR __ACCESS_CP15(c7, 0, c8, 0)
|
||||||
#define TLBIALLIS __ACCESS_CP15(c8, 0, c3, 0)
|
#define TLBIALLIS __ACCESS_CP15(c8, 0, c3, 0)
|
||||||
#define TLBIALL __ACCESS_CP15(c8, 0, c7, 0)
|
#define TLBIALL __ACCESS_CP15(c8, 0, c7, 0)
|
||||||
|
|||||||
@@ -37,6 +37,8 @@
|
|||||||
|
|
||||||
#include <linux/highmem.h>
|
#include <linux/highmem.h>
|
||||||
#include <asm/cacheflush.h>
|
#include <asm/cacheflush.h>
|
||||||
|
#include <asm/cputype.h>
|
||||||
|
#include <asm/kvm_hyp.h>
|
||||||
#include <asm/pgalloc.h>
|
#include <asm/pgalloc.h>
|
||||||
#include <asm/stage2_pgtable.h>
|
#include <asm/stage2_pgtable.h>
|
||||||
|
|
||||||
@@ -83,6 +85,18 @@ static inline pmd_t kvm_s2pmd_mkwrite(pmd_t pmd)
|
|||||||
return pmd;
|
return pmd;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
static inline pte_t kvm_s2pte_mkexec(pte_t pte)
|
||||||
|
{
|
||||||
|
pte_val(pte) &= ~L_PTE_XN;
|
||||||
|
return pte;
|
||||||
|
}
|
||||||
|
|
||||||
|
static inline pmd_t kvm_s2pmd_mkexec(pmd_t pmd)
|
||||||
|
{
|
||||||
|
pmd_val(pmd) &= ~PMD_SECT_XN;
|
||||||
|
return pmd;
|
||||||
|
}
|
||||||
|
|
||||||
static inline void kvm_set_s2pte_readonly(pte_t *pte)
|
static inline void kvm_set_s2pte_readonly(pte_t *pte)
|
||||||
{
|
{
|
||||||
pte_val(*pte) = (pte_val(*pte) & ~L_PTE_S2_RDWR) | L_PTE_S2_RDONLY;
|
pte_val(*pte) = (pte_val(*pte) & ~L_PTE_S2_RDWR) | L_PTE_S2_RDONLY;
|
||||||
@@ -93,6 +107,11 @@ static inline bool kvm_s2pte_readonly(pte_t *pte)
|
|||||||
return (pte_val(*pte) & L_PTE_S2_RDWR) == L_PTE_S2_RDONLY;
|
return (pte_val(*pte) & L_PTE_S2_RDWR) == L_PTE_S2_RDONLY;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
static inline bool kvm_s2pte_exec(pte_t *pte)
|
||||||
|
{
|
||||||
|
return !(pte_val(*pte) & L_PTE_XN);
|
||||||
|
}
|
||||||
|
|
||||||
static inline void kvm_set_s2pmd_readonly(pmd_t *pmd)
|
static inline void kvm_set_s2pmd_readonly(pmd_t *pmd)
|
||||||
{
|
{
|
||||||
pmd_val(*pmd) = (pmd_val(*pmd) & ~L_PMD_S2_RDWR) | L_PMD_S2_RDONLY;
|
pmd_val(*pmd) = (pmd_val(*pmd) & ~L_PMD_S2_RDWR) | L_PMD_S2_RDONLY;
|
||||||
@@ -103,6 +122,11 @@ static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
|
|||||||
return (pmd_val(*pmd) & L_PMD_S2_RDWR) == L_PMD_S2_RDONLY;
|
return (pmd_val(*pmd) & L_PMD_S2_RDWR) == L_PMD_S2_RDONLY;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
static inline bool kvm_s2pmd_exec(pmd_t *pmd)
|
||||||
|
{
|
||||||
|
return !(pmd_val(*pmd) & PMD_SECT_XN);
|
||||||
|
}
|
||||||
|
|
||||||
static inline bool kvm_page_empty(void *ptr)
|
static inline bool kvm_page_empty(void *ptr)
|
||||||
{
|
{
|
||||||
struct page *ptr_page = virt_to_page(ptr);
|
struct page *ptr_page = virt_to_page(ptr);
|
||||||
@@ -126,21 +150,10 @@ static inline bool vcpu_has_cache_enabled(struct kvm_vcpu *vcpu)
|
|||||||
return (vcpu_cp15(vcpu, c1_SCTLR) & 0b101) == 0b101;
|
return (vcpu_cp15(vcpu, c1_SCTLR) & 0b101) == 0b101;
|
||||||
}
|
}
|
||||||
|
|
||||||
static inline void __coherent_cache_guest_page(struct kvm_vcpu *vcpu,
|
static inline void __clean_dcache_guest_page(kvm_pfn_t pfn, unsigned long size)
|
||||||
kvm_pfn_t pfn,
|
|
||||||
unsigned long size)
|
|
||||||
{
|
{
|
||||||
/*
|
/*
|
||||||
* If we are going to insert an instruction page and the icache is
|
* Clean the dcache to the Point of Coherency.
|
||||||
* either VIPT or PIPT, there is a potential problem where the host
|
|
||||||
* (or another VM) may have used the same page as this guest, and we
|
|
||||||
* read incorrect data from the icache. If we're using a PIPT cache,
|
|
||||||
* we can invalidate just that page, but if we are using a VIPT cache
|
|
||||||
* we need to invalidate the entire icache - damn shame - as written
|
|
||||||
* in the ARM ARM (DDI 0406C.b - Page B3-1393).
|
|
||||||
*
|
|
||||||
* VIVT caches are tagged using both the ASID and the VMID and doesn't
|
|
||||||
* need any kind of flushing (DDI 0406C.b - Page B3-1392).
|
|
||||||
*
|
*
|
||||||
* We need to do this through a kernel mapping (using the
|
* We need to do this through a kernel mapping (using the
|
||||||
* user-space mapping has proved to be the wrong
|
* user-space mapping has proved to be the wrong
|
||||||
@@ -155,9 +168,63 @@ static inline void __coherent_cache_guest_page(struct kvm_vcpu *vcpu,
|
|||||||
|
|
||||||
kvm_flush_dcache_to_poc(va, PAGE_SIZE);
|
kvm_flush_dcache_to_poc(va, PAGE_SIZE);
|
||||||
|
|
||||||
if (icache_is_pipt())
|
size -= PAGE_SIZE;
|
||||||
__cpuc_coherent_user_range((unsigned long)va,
|
pfn++;
|
||||||
(unsigned long)va + PAGE_SIZE);
|
|
||||||
|
kunmap_atomic(va);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
static inline void __invalidate_icache_guest_page(kvm_pfn_t pfn,
|
||||||
|
unsigned long size)
|
||||||
|
{
|
||||||
|
u32 iclsz;
|
||||||
|
|
||||||
|
/*
|
||||||
|
* If we are going to insert an instruction page and the icache is
|
||||||
|
* either VIPT or PIPT, there is a potential problem where the host
|
||||||
|
* (or another VM) may have used the same page as this guest, and we
|
||||||
|
* read incorrect data from the icache. If we're using a PIPT cache,
|
||||||
|
* we can invalidate just that page, but if we are using a VIPT cache
|
||||||
|
* we need to invalidate the entire icache - damn shame - as written
|
||||||
|
* in the ARM ARM (DDI 0406C.b - Page B3-1393).
|
||||||
|
*
|
||||||
|
* VIVT caches are tagged using both the ASID and the VMID and doesn't
|
||||||
|
* need any kind of flushing (DDI 0406C.b - Page B3-1392).
|
||||||
|
*/
|
||||||
|
|
||||||
|
VM_BUG_ON(size & ~PAGE_MASK);
|
||||||
|
|
||||||
|
if (icache_is_vivt_asid_tagged())
|
||||||
|
return;
|
||||||
|
|
||||||
|
if (!icache_is_pipt()) {
|
||||||
|
/* any kind of VIPT cache */
|
||||||
|
__flush_icache_all();
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
/*
|
||||||
|
* CTR IminLine contains Log2 of the number of words in the
|
||||||
|
* cache line, so we can get the number of words as
|
||||||
|
* 2 << (IminLine - 1). To get the number of bytes, we
|
||||||
|
* multiply by 4 (the number of bytes in a 32-bit word), and
|
||||||
|
* get 4 << (IminLine).
|
||||||
|
*/
|
||||||
|
iclsz = 4 << (read_cpuid(CPUID_CACHETYPE) & 0xf);
|
||||||
|
|
||||||
|
while (size) {
|
||||||
|
void *va = kmap_atomic_pfn(pfn);
|
||||||
|
void *end = va + PAGE_SIZE;
|
||||||
|
void *addr = va;
|
||||||
|
|
||||||
|
do {
|
||||||
|
write_sysreg(addr, ICIMVAU);
|
||||||
|
addr += iclsz;
|
||||||
|
} while (addr < end);
|
||||||
|
|
||||||
|
dsb(ishst);
|
||||||
|
isb();
|
||||||
|
|
||||||
size -= PAGE_SIZE;
|
size -= PAGE_SIZE;
|
||||||
pfn++;
|
pfn++;
|
||||||
@@ -165,9 +232,11 @@ static inline void __coherent_cache_guest_page(struct kvm_vcpu *vcpu,
|
|||||||
kunmap_atomic(va);
|
kunmap_atomic(va);
|
||||||
}
|
}
|
||||||
|
|
||||||
if (!icache_is_pipt() && !icache_is_vivt_asid_tagged()) {
|
/* Check if we need to invalidate the BTB */
|
||||||
/* any kind of VIPT cache */
|
if ((read_cpuid_ext(CPUID_EXT_MMFR1) >> 28) != 4) {
|
||||||
__flush_icache_all();
|
write_sysreg(0, BPIALLIS);
|
||||||
|
dsb(ishst);
|
||||||
|
isb();
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
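Note on the 32-bit ARM hunk above: the comment in __invalidate_icache_guest_page() derives the I-cache line size in bytes from CTR.IminLine as 4 << IminLine. The following is only a minimal user-space sketch of that arithmetic, not kernel code. The names icache_line_bytes, icimvau_ops and SKETCH_PAGE_SIZE are hypothetical, and a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumption: 4 KiB pages, for illustration only */

/* CTR.IminLine (bits [3:0]) is log2 of the I-cache line length in 32-bit words,
 * so the length in bytes is 4 << IminLine. */
static inline uint32_t icache_line_bytes(uint32_t ctr)
{
	return 4u << (ctr & 0xf);
}

/* Count how many per-line invalidations (one ICIMVAU write each in the hunk)
 * would be issued to cover `size` bytes, walking one page at a time. */
static inline uint32_t icimvau_ops(uint32_t ctr, uint32_t size)
{
	uint32_t iclsz = icache_line_bytes(ctr);
	uint32_t ops = 0;

	/* Mirrors VM_BUG_ON(size & ~PAGE_MASK): size must be page aligned. */
	assert((size % SKETCH_PAGE_SIZE) == 0);

	while (size) {
		for (uint32_t off = 0; off < SKETCH_PAGE_SIZE; off += iclsz)
			ops++;
		size -= SKETCH_PAGE_SIZE;
	}
	return ops;
}
```

With IminLine = 4 the line is 64 bytes, so each 4 KiB page takes 64 per-line operations under these assumptions.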
|
|||||||
@@ -102,8 +102,8 @@ extern pgprot_t pgprot_s2_device;
|
|||||||
#define PAGE_HYP_EXEC _MOD_PROT(pgprot_kernel, L_PTE_HYP | L_PTE_RDONLY)
|
#define PAGE_HYP_EXEC _MOD_PROT(pgprot_kernel, L_PTE_HYP | L_PTE_RDONLY)
|
||||||
#define PAGE_HYP_RO _MOD_PROT(pgprot_kernel, L_PTE_HYP | L_PTE_RDONLY | L_PTE_XN)
|
#define PAGE_HYP_RO _MOD_PROT(pgprot_kernel, L_PTE_HYP | L_PTE_RDONLY | L_PTE_XN)
|
||||||
#define PAGE_HYP_DEVICE _MOD_PROT(pgprot_hyp_device, L_PTE_HYP)
|
#define PAGE_HYP_DEVICE _MOD_PROT(pgprot_hyp_device, L_PTE_HYP)
|
||||||
#define PAGE_S2 _MOD_PROT(pgprot_s2, L_PTE_S2_RDONLY)
|
#define PAGE_S2 _MOD_PROT(pgprot_s2, L_PTE_S2_RDONLY | L_PTE_XN)
|
||||||
#define PAGE_S2_DEVICE _MOD_PROT(pgprot_s2_device, L_PTE_S2_RDONLY)
|
#define PAGE_S2_DEVICE _MOD_PROT(pgprot_s2_device, L_PTE_S2_RDONLY | L_PTE_XN)
|
||||||
|
|
||||||
#define __PAGE_NONE __pgprot(_L_PTE_DEFAULT | L_PTE_RDONLY | L_PTE_XN | L_PTE_NONE)
|
#define __PAGE_NONE __pgprot(_L_PTE_DEFAULT | L_PTE_RDONLY | L_PTE_XN | L_PTE_NONE)
|
||||||
#define __PAGE_SHARED __pgprot(_L_PTE_DEFAULT | L_PTE_USER | L_PTE_XN)
|
#define __PAGE_SHARED __pgprot(_L_PTE_DEFAULT | L_PTE_USER | L_PTE_XN)
|
||||||
|
|||||||
@@ -18,6 +18,7 @@
|
|||||||
|
|
||||||
#include <asm/kvm_asm.h>
|
#include <asm/kvm_asm.h>
|
||||||
#include <asm/kvm_hyp.h>
|
#include <asm/kvm_hyp.h>
|
||||||
|
#include <asm/kvm_mmu.h>
|
||||||
|
|
||||||
__asm__(".arch_extension virt");
|
__asm__(".arch_extension virt");
|
||||||
|
|
||||||
|
|||||||
@@ -19,6 +19,7 @@
|
|||||||
*/
|
*/
|
||||||
|
|
||||||
#include <asm/kvm_hyp.h>
|
#include <asm/kvm_hyp.h>
|
||||||
|
#include <asm/kvm_mmu.h>
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Flush per-VMID TLBs
|
* Flush per-VMID TLBs
|
||||||
|
|||||||
@@ -435,6 +435,27 @@ alternative_endif
|
|||||||
dsb \domain
|
dsb \domain
|
||||||
.endm
|
.endm
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Macro to perform an instruction cache maintenance for the interval
|
||||||
|
* [start, end)
|
||||||
|
*
|
||||||
|
* start, end: virtual addresses describing the region
|
||||||
|
* label: A label to branch to on user fault.
|
||||||
|
* Corrupts: tmp1, tmp2
|
||||||
|
*/
|
||||||
|
.macro invalidate_icache_by_line start, end, tmp1, tmp2, label
|
||||||
|
icache_line_size \tmp1, \tmp2
|
||||||
|
sub \tmp2, \tmp1, #1
|
||||||
|
bic \tmp2, \start, \tmp2
|
||||||
|
9997:
|
||||||
|
USER(\label, ic ivau, \tmp2) // invalidate I line PoU
|
||||||
|
add \tmp2, \tmp2, \tmp1
|
||||||
|
cmp \tmp2, \end
|
||||||
|
b.lo 9997b
|
||||||
|
dsb ish
|
||||||
|
isb
|
||||||
|
.endm
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* reset_pmuserenr_el0 - reset PMUSERENR_EL0 if PMUv3 present
|
* reset_pmuserenr_el0 - reset PMUSERENR_EL0 if PMUv3 present
|
||||||
*/
|
*/
|
||||||
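Note on the invalidate_icache_by_line macro above: it aligns the start address down to a cache-line boundary (the sub/bic pair), then invalidates one line per iteration until the address reaches end (the cmp/b.lo pair). The C sketch below only models that address walk so the loop bounds are easier to check. The function name lines_invalidated is hypothetical, the line size is assumed to be a power of two, and no cache maintenance is performed.

```c
#include <stdint.h>

/* Model of the address walk in the macro: returns how many cache lines
 * would be touched for the interval [start, end). */
static inline uint64_t lines_invalidated(uint64_t start, uint64_t end,
					 uint64_t line)
{
	uint64_t addr = start & ~(line - 1); /* bic: align start down to a line */
	uint64_t n = 0;

	do {
		n++;            /* stands in for "ic ivau" on this line */
		addr += line;   /* add tmp2, tmp2, tmp1 */
	} while (addr < end);   /* cmp + b.lo: loop while addr < end */

	return n;
}
```

Because the body runs before the comparison, this model touches one line even when start equals end, which matches the instruction order shown in the hunk.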
|
|||||||
@@ -52,6 +52,12 @@
|
|||||||
* - start - virtual start address
|
* - start - virtual start address
|
||||||
* - end - virtual end address
|
* - end - virtual end address
|
||||||
*
|
*
|
||||||
|
* invalidate_icache_range(start, end)
|
||||||
|
*
|
||||||
|
* Invalidate the I-cache in the region described by start, end.
|
||||||
|
* - start - virtual start address
|
||||||
|
* - end - virtual end address
|
||||||
|
*
|
||||||
* __flush_cache_user_range(start, end)
|
* __flush_cache_user_range(start, end)
|
||||||
*
|
*
|
||||||
* Ensure coherency between the I-cache and the D-cache in the
|
* Ensure coherency between the I-cache and the D-cache in the
|
||||||
@@ -66,6 +72,7 @@
|
|||||||
* - size - region size
|
* - size - region size
|
||||||
*/
|
*/
|
||||||
extern void flush_icache_range(unsigned long start, unsigned long end);
|
extern void flush_icache_range(unsigned long start, unsigned long end);
|
||||||
|
extern int invalidate_icache_range(unsigned long start, unsigned long end);
|
||||||
extern void __flush_dcache_area(void *addr, size_t len);
|
extern void __flush_dcache_area(void *addr, size_t len);
|
||||||
extern void __inval_dcache_area(void *addr, size_t len);
|
extern void __inval_dcache_area(void *addr, size_t len);
|
||||||
extern void __clean_dcache_area_poc(void *addr, size_t len);
|
extern void __clean_dcache_area_poc(void *addr, size_t len);
|
||||||
|
|||||||
@@ -48,6 +48,8 @@
|
|||||||
KVM_ARCH_REQ_FLAGS(0, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
|
KVM_ARCH_REQ_FLAGS(0, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
|
||||||
#define KVM_REQ_IRQ_PENDING KVM_ARCH_REQ(1)
|
#define KVM_REQ_IRQ_PENDING KVM_ARCH_REQ(1)
|
||||||
|
|
||||||
|
DECLARE_STATIC_KEY_FALSE(userspace_irqchip_in_use);
|
||||||
|
|
||||||
int __attribute_const__ kvm_target_cpu(void);
|
int __attribute_const__ kvm_target_cpu(void);
|
||||||
int kvm_reset_vcpu(struct kvm_vcpu *vcpu);
|
int kvm_reset_vcpu(struct kvm_vcpu *vcpu);
|
||||||
int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext);
|
int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext);
|
||||||
|
|||||||
@@ -20,7 +20,6 @@
|
|||||||
|
|
||||||
#include <linux/compiler.h>
|
#include <linux/compiler.h>
|
||||||
#include <linux/kvm_host.h>
|
#include <linux/kvm_host.h>
|
||||||
#include <asm/kvm_mmu.h>
|
|
||||||
#include <asm/sysreg.h>
|
#include <asm/sysreg.h>
|
||||||
|
|
||||||
#define __hyp_text __section(.hyp.text) notrace
|
#define __hyp_text __section(.hyp.text) notrace
|
||||||
|
|||||||
@@ -173,6 +173,18 @@ static inline pmd_t kvm_s2pmd_mkwrite(pmd_t pmd)
|
|||||||
return pmd;
|
return pmd;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
static inline pte_t kvm_s2pte_mkexec(pte_t pte)
|
||||||
|
{
|
||||||
|
pte_val(pte) &= ~PTE_S2_XN;
|
||||||
|
return pte;
|
||||||
|
}
|
||||||
|
|
||||||
|
static inline pmd_t kvm_s2pmd_mkexec(pmd_t pmd)
|
||||||
|
{
|
||||||
|
pmd_val(pmd) &= ~PMD_S2_XN;
|
||||||
|
return pmd;
|
||||||
|
}
|
||||||
|
|
||||||
static inline void kvm_set_s2pte_readonly(pte_t *pte)
|
static inline void kvm_set_s2pte_readonly(pte_t *pte)
|
||||||
{
|
{
|
||||||
pteval_t old_pteval, pteval;
|
pteval_t old_pteval, pteval;
|
||||||
@@ -191,6 +203,11 @@ static inline bool kvm_s2pte_readonly(pte_t *pte)
|
|||||||
return (pte_val(*pte) & PTE_S2_RDWR) == PTE_S2_RDONLY;
|
return (pte_val(*pte) & PTE_S2_RDWR) == PTE_S2_RDONLY;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
static inline bool kvm_s2pte_exec(pte_t *pte)
|
||||||
|
{
|
||||||
|
return !(pte_val(*pte) & PTE_S2_XN);
|
||||||
|
}
|
||||||
|
|
||||||
static inline void kvm_set_s2pmd_readonly(pmd_t *pmd)
|
static inline void kvm_set_s2pmd_readonly(pmd_t *pmd)
|
||||||
{
|
{
|
||||||
kvm_set_s2pte_readonly((pte_t *)pmd);
|
kvm_set_s2pte_readonly((pte_t *)pmd);
|
||||||
@@ -201,6 +218,11 @@ static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
|
|||||||
return kvm_s2pte_readonly((pte_t *)pmd);
|
return kvm_s2pte_readonly((pte_t *)pmd);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
static inline bool kvm_s2pmd_exec(pmd_t *pmd)
|
||||||
|
{
|
||||||
|
return !(pmd_val(*pmd) & PMD_S2_XN);
|
||||||
|
}
|
||||||
|
|
||||||
static inline bool kvm_page_empty(void *ptr)
|
static inline bool kvm_page_empty(void *ptr)
|
||||||
{
|
{
|
||||||
struct page *ptr_page = virt_to_page(ptr);
|
struct page *ptr_page = virt_to_page(ptr);
|
||||||
@@ -230,21 +252,25 @@ static inline bool vcpu_has_cache_enabled(struct kvm_vcpu *vcpu)
|
|||||||
return (vcpu_sys_reg(vcpu, SCTLR_EL1) & 0b101) == 0b101;
|
return (vcpu_sys_reg(vcpu, SCTLR_EL1) & 0b101) == 0b101;
|
||||||
}
|
}
|
||||||
|
|
||||||
static inline void __coherent_cache_guest_page(struct kvm_vcpu *vcpu,
|
static inline void __clean_dcache_guest_page(kvm_pfn_t pfn, unsigned long size)
|
||||||
kvm_pfn_t pfn,
|
|
||||||
unsigned long size)
|
|
||||||
{
|
{
|
||||||
void *va = page_address(pfn_to_page(pfn));
|
void *va = page_address(pfn_to_page(pfn));
|
||||||
|
|
||||||
kvm_flush_dcache_to_poc(va, size);
|
kvm_flush_dcache_to_poc(va, size);
|
||||||
|
}
|
||||||
|
|
||||||
|
static inline void __invalidate_icache_guest_page(kvm_pfn_t pfn,
|
||||||
|
unsigned long size)
|
||||||
|
{
|
||||||
if (icache_is_aliasing()) {
|
if (icache_is_aliasing()) {
|
||||||
/* any kind of VIPT cache */
|
/* any kind of VIPT cache */
|
||||||
__flush_icache_all();
|
__flush_icache_all();
|
||||||
} else if (is_kernel_in_hyp_mode() || !icache_is_vpipt()) {
|
} else if (is_kernel_in_hyp_mode() || !icache_is_vpipt()) {
|
||||||
/* PIPT or VPIPT at EL2 (see comment in __kvm_tlb_flush_vmid_ipa) */
|
/* PIPT or VPIPT at EL2 (see comment in __kvm_tlb_flush_vmid_ipa) */
|
||||||
flush_icache_range((unsigned long)va,
|
void *va = page_address(pfn_to_page(pfn));
|
||||||
(unsigned long)va + size);
|
|
||||||
|
invalidate_icache_range((unsigned long)va,
|
||||||
|
(unsigned long)va + size);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|||||||
@@ -187,9 +187,11 @@
|
|||||||
*/
|
*/
|
||||||
#define PTE_S2_RDONLY (_AT(pteval_t, 1) << 6) /* HAP[2:1] */
|
#define PTE_S2_RDONLY (_AT(pteval_t, 1) << 6) /* HAP[2:1] */
|
||||||
#define PTE_S2_RDWR (_AT(pteval_t, 3) << 6) /* HAP[2:1] */
|
#define PTE_S2_RDWR (_AT(pteval_t, 3) << 6) /* HAP[2:1] */
|
||||||
|
#define PTE_S2_XN (_AT(pteval_t, 2) << 53) /* XN[1:0] */
|
||||||
|
|
||||||
#define PMD_S2_RDONLY (_AT(pmdval_t, 1) << 6) /* HAP[2:1] */
|
#define PMD_S2_RDONLY (_AT(pmdval_t, 1) << 6) /* HAP[2:1] */
|
||||||
#define PMD_S2_RDWR (_AT(pmdval_t, 3) << 6) /* HAP[2:1] */
|
#define PMD_S2_RDWR (_AT(pmdval_t, 3) << 6) /* HAP[2:1] */
|
||||||
|
#define PMD_S2_XN (_AT(pmdval_t, 2) << 53) /* XN[1:0] */
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* Memory Attribute override for Stage-2 (MemAttr[3:0])
|
* Memory Attribute override for Stage-2 (MemAttr[3:0])
|
||||||
|
|||||||
@@ -67,8 +67,8 @@
|
|||||||
#define PAGE_HYP_RO __pgprot(_HYP_PAGE_DEFAULT | PTE_HYP | PTE_RDONLY | PTE_HYP_XN)
|
#define PAGE_HYP_RO __pgprot(_HYP_PAGE_DEFAULT | PTE_HYP | PTE_RDONLY | PTE_HYP_XN)
|
||||||
#define PAGE_HYP_DEVICE __pgprot(PROT_DEVICE_nGnRE | PTE_HYP)
|
#define PAGE_HYP_DEVICE __pgprot(PROT_DEVICE_nGnRE | PTE_HYP)
|
||||||
|
|
||||||
#define PAGE_S2 __pgprot(_PROT_DEFAULT | PTE_S2_MEMATTR(MT_S2_NORMAL) | PTE_S2_RDONLY)
|
#define PAGE_S2 __pgprot(_PROT_DEFAULT | PTE_S2_MEMATTR(MT_S2_NORMAL) | PTE_S2_RDONLY | PTE_S2_XN)
|
||||||
#define PAGE_S2_DEVICE __pgprot(_PROT_DEFAULT | PTE_S2_MEMATTR(MT_S2_DEVICE_nGnRE) | PTE_S2_RDONLY | PTE_UXN)
|
#define PAGE_S2_DEVICE __pgprot(_PROT_DEFAULT | PTE_S2_MEMATTR(MT_S2_DEVICE_nGnRE) | PTE_S2_RDONLY | PTE_S2_XN)
|
||||||
|
|
||||||
#define PAGE_NONE __pgprot(((_PAGE_DEFAULT) & ~PTE_VALID) | PTE_PROT_NONE | PTE_RDONLY | PTE_NG | PTE_PXN | PTE_UXN)
|
#define PAGE_NONE __pgprot(((_PAGE_DEFAULT) & ~PTE_VALID) | PTE_PROT_NONE | PTE_RDONLY | PTE_NG | PTE_PXN | PTE_UXN)
|
||||||
#define PAGE_SHARED __pgprot(_PAGE_DEFAULT | PTE_USER | PTE_NG | PTE_PXN | PTE_UXN | PTE_WRITE)
|
#define PAGE_SHARED __pgprot(_PAGE_DEFAULT | PTE_USER | PTE_NG | PTE_PXN | PTE_UXN | PTE_WRITE)
|
||||||
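Note on the arm64 stage-2 hunks above: PAGE_S2 and PAGE_S2_DEVICE now carry PTE_S2_XN, and kvm_s2pte_mkexec() / kvm_s2pte_exec() clear and test that bit. The sketch below restates the bit manipulation on a plain 64-bit value so it can be checked outside the kernel. The type s2_pteval_t and the S2_* / s2_* names are hypothetical stand-ins; only the shift values are taken from the diff.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t s2_pteval_t; /* hypothetical stand-in for pteval_t */

#define S2_RDONLY ((s2_pteval_t)1 << 6)  /* HAP[2:1] = 0b01, as in PTE_S2_RDONLY */
#define S2_RDWR   ((s2_pteval_t)3 << 6)  /* HAP[2:1] = 0b11, as in PTE_S2_RDWR */
#define S2_XN     ((s2_pteval_t)2 << 53) /* XN[1:0] = 0b10, i.e. bit 54 set */

/* Clear the execute-never bit, as kvm_s2pte_mkexec() does. */
static inline s2_pteval_t s2_mkexec(s2_pteval_t pte)
{
	return pte & ~S2_XN;
}

/* A mapping is executable when the XN bit is clear, as in kvm_s2pte_exec(). */
static inline bool s2_exec(s2_pteval_t pte)
{
	return !(pte & S2_XN);
}

/* Read-only test on the HAP field, as in kvm_s2pte_readonly(). */
static inline bool s2_readonly(s2_pteval_t pte)
{
	return (pte & S2_RDWR) == S2_RDONLY;
}
```

A value built from S2_RDONLY | S2_XN starts out non-executable, which matches the new PAGE_S2 default; clearing XN leaves the read-only permission untouched.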
|
|||||||
Some files were not shown because too many files have changed in this diff.