Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"s390:
- More phys_to_virt conversions
- Improvement of AP management for VSIE (nested virtualization)
ARM64:
- Numerous fixes for the pathological lock inversion issue that
plagued KVM/arm64 since... forever.
- New framework allowing SMCCC-compliant hypercalls to be forwarded
to userspace, hopefully paving the way for some more features being
moved to VMMs rather than be implemented in the kernel.
- Large rework of the timer code to allow a VM-wide offset to be
applied to both virtual and physical counters as well as a
per-timer, per-vcpu offset that complements the global one. This
last part allows the NV timer code to be implemented on top.
- A small set of fixes to make sure that we don't change anything
affecting the EL1&0 translation regime just after having taken an
exception to EL2 until we have executed a DSB. This ensures that
speculative walks started in EL1&0 have completed.
- The usual selftest fixes and improvements.
x86:
- Optimize CR0.WP toggling by avoiding an MMU reload when TDP is
enabled, and by giving the guest control of CR0.WP when EPT is
enabled on VMX (VMX-only because SVM doesn't support per-bit
controls)
- Add CR0/CR4 helpers to query single bits, and clean up related code
where KVM was interpreting kvm_read_cr4_bits()'s "unsigned long"
return as a bool
- Move AMD_PSFD to cpufeatures.h and purge KVM's definition
- Avoid unnecessary writes+flushes when the guest is only adding new
PTEs
- Overhaul .sync_page() and .invlpg() to utilize .sync_page()'s
optimizations when emulating invalidations
- Clean up the range-based flushing APIs
- Revamp the TDP MMU's reaping of Accessed/Dirty bits to clear a
single A/D bit using a LOCK AND instead of XCHG, and skip all of
the "handle changed SPTE" overhead associated with writing the
entire entry
- Track the number of "tail" entries in a pte_list_desc to avoid
having to walk (potentially) all descriptors during insertion and
deletion, which gets quite expensive if the guest is spamming
fork()
- Disallow virtualizing legacy LBRs if architectural LBRs are
available; the two are mutually exclusive in hardware
- Disallow writes to immutable feature MSRs (notably
PERF_CAPABILITIES) after KVM_RUN, similar to CPUID features
- Overhaul the vmx_pmu_caps selftest to better validate
PERF_CAPABILITIES
- Apply PMU filters to emulated events and add test coverage to the
pmu_event_filter selftest
- AMD SVM:
- Add support for virtual NMIs
- Fixes for edge cases related to virtual interrupts
- Intel AMX:
- Don't advertise XTILE_CFG in KVM_GET_SUPPORTED_CPUID if
XTILE_DATA is not being reported due to userspace not opting in
via prctl()
- Fix a bug in emulation of ENCLS in compatibility mode
- Allow emulation of NOP and PAUSE for L2
- AMX selftests improvements
- Misc cleanups
MIPS:
- Constify MIPS's internal callbacks (a leftover from the hardware
enabling rework that landed in 6.3)
Generic:
- Drop unnecessary casts from "void *" throughout kvm_main.c
- Tweak the layout of "struct kvm_mmu_memory_cache" to shrink the
struct size by 8 bytes on 64-bit kernels by utilizing a padding
hole
Documentation:
- Fix goof introduced by the conversion to rST"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (211 commits)
KVM: s390: pci: fix virtual-physical confusion on module unload/load
KVM: s390: vsie: clarifications on setting the APCB
KVM: s390: interrupt: fix virtual-physical confusion for next alert GISA
KVM: arm64: Have kvm_psci_vcpu_on() use WRITE_ONCE() to update mp_state
KVM: arm64: Acquire mp_state_lock in kvm_arch_vcpu_ioctl_vcpu_init()
KVM: selftests: Test the PMU event "Instructions retired"
KVM: selftests: Copy full counter values from guest in PMU event filter test
KVM: selftests: Use error codes to signal errors in PMU event filter test
KVM: selftests: Print detailed info in PMU event filter asserts
KVM: selftests: Add helpers for PMC asserts in PMU event filter test
KVM: selftests: Add a common helper for the PMU event filter guest code
KVM: selftests: Fix spelling mistake "perrmited" -> "permitted"
KVM: arm64: vhe: Drop extra isb() on guest exit
KVM: arm64: vhe: Synchronise with page table walker on MMU update
KVM: arm64: pkvm: Document the side effects of kvm_flush_dcache_to_poc()
KVM: arm64: nvhe: Synchronise with page table walker on TLBI
KVM: arm64: Handle 32bit CNTPCTSS traps
KVM: arm64: nvhe: Synchronise with page table walker on vcpu run
KVM: arm64: vgic: Don't acquire its_lock before config_lock
KVM: selftests: Add test to verify KVM's supported XCR0
...
@@ -5645,7 +5645,8 @@ with the KVM_XEN_VCPU_GET_ATTR ioctl.
   };
 
 Copies Memory Tagging Extension (MTE) tags to/from guest tag memory. The
-``guest_ipa`` and ``length`` fields must be ``PAGE_SIZE`` aligned. The ``addr``
+``guest_ipa`` and ``length`` fields must be ``PAGE_SIZE`` aligned.
+``length`` must not be bigger than 2^31 - PAGE_SIZE bytes. The ``addr``
 field must point to a buffer which the tags will be copied to or from.
 
 ``flags`` specifies the direction of copy, either ``KVM_ARM_TAGS_TO_GUEST`` or
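As a usage sketch of this ioctl, assuming a 4KiB page size and a VM fd with KVM_CAP_ARM_MTE enabled; the tag buffer is assumed to hold one byte per 16-byte MTE granule (PAGE_SZ / 16 bytes here)::

    #include <linux/kvm.h>
    #include <sys/ioctl.h>
    #include <stdint.h>

    #define PAGE_SZ 4096UL    /* assumption: 4KiB pages */

    /* Read the MTE tags of one guest page into tag_buf (PAGE_SZ / 16
     * bytes). Returns bytes of guest memory processed, or a negative
     * errno. */
    static long read_page_tags(int vm_fd, uint64_t gipa, void *tag_buf)
    {
        struct kvm_arm_copy_mte_tags copy = {
            .guest_ipa = gipa,       /* must be PAGE_SIZE aligned */
            .length    = PAGE_SZ,    /* PAGE_SIZE aligned, below 2^31 */
            .addr      = tag_buf,
            .flags     = KVM_ARM_TAGS_FROM_GUEST,
        };

        return ioctl(vm_fd, KVM_ARM_MTE_COPY_TAGS, &copy);
    }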
@@ -6029,6 +6030,44 @@ delivery must be provided via the "reg_aen" struct.
 The "pad" and "reserved" fields may be used for future extensions and should be
 set to 0s by userspace.
 
+4.138 KVM_ARM_SET_COUNTER_OFFSET
+--------------------------------
+
+:Capability: KVM_CAP_COUNTER_OFFSET
+:Architectures: arm64
+:Type: vm ioctl
+:Parameters: struct kvm_arm_counter_offset (in)
+:Returns: 0 on success, < 0 on error
+
+This capability indicates that userspace is able to apply a single VM-wide
+offset to both the virtual and physical counters as viewed by the guest
+using the KVM_ARM_SET_COUNTER_OFFSET ioctl and the following data structure:
+
+::
+
+    struct kvm_arm_counter_offset {
+        __u64 counter_offset;
+        __u64 reserved;
+    };
+
+The offset describes a number of counter cycles that are subtracted from
+both virtual and physical counter views (similar to the effects of the
+CNTVOFF_EL2 and CNTPOFF_EL2 system registers, but only global). The offset
+always applies to all vcpus (already created or created after this ioctl)
+for this VM.
+
+It is userspace's responsibility to compute the offset based, for example,
+on previous values of the guest counters.
+
+Any value other than 0 for the "reserved" field may result in an error
+(-EINVAL) being returned. This ioctl can also return -EBUSY if any vcpu
+ioctl is issued concurrently.
+
+Note that using this ioctl results in KVM ignoring subsequent userspace
+writes to the CNTVCT_EL0 and CNTPCT_EL0 registers using the SET_ONE_REG
+interface. No error will be returned, but the resulting offset will not be
+applied.
+
 5. The kvm_run structure
 ========================
 
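A minimal userspace sketch of the ioctl above, assuming a VM fd on a host exposing KVM_CAP_COUNTER_OFFSET; the cycle count itself is up to the VMM (e.g. counter cycles elapsed while the guest was saved)::

    #include <linux/kvm.h>
    #include <sys/ioctl.h>
    #include <stdint.h>

    /* Apply a VM-wide counter offset, e.g. to hide time the guest spent
     * suspended. May fail with -EBUSY if a vcpu ioctl runs concurrently. */
    static int set_counter_offset(int vm_fd, uint64_t cycles)
    {
        struct kvm_arm_counter_offset off = {
            .counter_offset = cycles,  /* subtracted from both counter views */
            .reserved = 0,             /* must be zero */
        };

        return ioctl(vm_fd, KVM_ARM_SET_COUNTER_OFFSET, &off);
    }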
@@ -6218,15 +6257,40 @@ to the byte array.
 			__u64 nr;
 			__u64 args[6];
 			__u64 ret;
-			__u32 longmode;
-			__u32 pad;
+			__u64 flags;
 		} hypercall;
 
-Unused. This was once used for 'hypercall to userspace'. To implement
-such functionality, use KVM_EXIT_IO (x86) or KVM_EXIT_MMIO (all except s390).
+It is strongly recommended that userspace use ``KVM_EXIT_IO`` (x86) or
+``KVM_EXIT_MMIO`` (all except s390) to implement functionality that
+requires a guest to interact with host userspace.
 
 .. note:: KVM_EXIT_IO is significantly faster than KVM_EXIT_MMIO.
 
+For arm64:
+----------
+
+SMCCC exits can be enabled depending on the configuration of the SMCCC
+filter. See the Documentation/virt/kvm/devices/vm.rst
+``KVM_ARM_SMCCC_FILTER`` for more details.
+
+``nr`` contains the function ID of the guest's SMCCC call. Userspace is
+expected to use the ``KVM_GET_ONE_REG`` ioctl to retrieve the call
+parameters from the vCPU's GPRs.
+
+Definition of ``flags``:
+
+ - ``KVM_HYPERCALL_EXIT_SMC``: Indicates that the guest used the SMC
+   conduit to initiate the SMCCC call. If this bit is 0 then the guest
+   used the HVC conduit for the SMCCC call.
+
+ - ``KVM_HYPERCALL_EXIT_16BIT``: Indicates that the guest used a 16bit
+   instruction to initiate the SMCCC call. If this bit is 0 then the
+   guest used a 32bit instruction. An AArch64 guest always has this
+   bit set to 0.
+
+At the point of exit, PC points to the instruction immediately following
+the trapping instruction.
+
 ::
 
 		/* KVM_EXIT_TPR_ACCESS */
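A minimal sketch of the matching VMM side, assuming ``run`` points at the vCPU's mmap()ed ``struct kvm_run`` and an SMCCC filter forwards the call range to userspace::

    #include <linux/kvm.h>
    #include <stdint.h>
    #include <stdio.h>

    static void handle_smccc_exit(struct kvm_run *run)
    {
        if (run->exit_reason != KVM_EXIT_HYPERCALL)
            return;

        /* nr holds the SMCCC function ID of the guest's call */
        uint64_t fn = run->hypercall.nr;
        int smc = !!(run->hypercall.flags & KVM_HYPERCALL_EXIT_SMC);

        printf("SMCCC call %#llx via %s\n",
               (unsigned long long)fn, smc ? "SMC" : "HVC");

        /* The call arguments live in the vCPU GPRs: fetch them with
         * KVM_GET_ONE_REG and post return values with KVM_SET_ONE_REG
         * before the next KVM_RUN. */
    }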
@@ -7266,6 +7330,7 @@ and injected exceptions.
 will clear DR6.RTM.
 
 7.18 KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2
 --------------------------------------
 
 :Architectures: x86, arm64, mips
 :Parameters: args[0] whether feature should be enabled or not
@@ -321,3 +321,82 @@ Allows userspace to query the status of migration mode.
 	     if it is enabled
 :Returns: -EFAULT if the given address is not accessible from kernel space;
 	  0 in case of success.
+
+6. GROUP: KVM_ARM_VM_SMCCC_CTRL
+===============================
+
+:Architectures: arm64
+
+6.1. ATTRIBUTE: KVM_ARM_VM_SMCCC_FILTER (w/o)
+---------------------------------------------
+
+:Parameters: Pointer to a ``struct kvm_smccc_filter``
+
+:Returns:
+
+   ======  ===========================================
+   EEXIST  Range intersects with a previously inserted
+           or reserved range
+   EBUSY   A vCPU in the VM has already run
+   EINVAL  Invalid filter configuration
+   ENOMEM  Failed to allocate memory for the in-kernel
+           representation of the SMCCC filter
+   ======  ===========================================
+
+Requests the installation of an SMCCC call filter described as follows::
+
+    enum kvm_smccc_filter_action {
+        KVM_SMCCC_FILTER_HANDLE = 0,
+        KVM_SMCCC_FILTER_DENY,
+        KVM_SMCCC_FILTER_FWD_TO_USER,
+    };
+
+    struct kvm_smccc_filter {
+        __u32 base;
+        __u32 nr_functions;
+        __u8 action;
+        __u8 pad[15];
+    };
+
+The filter is defined as a set of non-overlapping ranges. Each
+range defines an action to be applied to SMCCC calls within the range.
+Userspace can insert multiple ranges into the filter by using
+successive calls to this attribute.
+
+The default configuration of KVM is such that all implemented SMCCC
+calls are allowed. Thus, the SMCCC filter can be defined sparsely
+by userspace, only describing ranges that modify the default behavior.
+
+The range expressed by ``struct kvm_smccc_filter`` is
+[``base``, ``base + nr_functions``). The range is not allowed to wrap,
+i.e. userspace cannot rely on ``base + nr_functions`` overflowing.
+
+The SMCCC filter applies to both SMC and HVC calls initiated by the
+guest. The SMCCC filter gates the in-kernel emulation of SMCCC calls
+and as such takes effect before other interfaces that interact with
+SMCCC calls (e.g. hypercall bitmap registers).
+
+Actions:
+
+- ``KVM_SMCCC_FILTER_HANDLE``: Allows the guest SMCCC call to be
+  handled in-kernel. It is strongly recommended that userspace *not*
+  explicitly describe the allowed SMCCC call ranges.
+
+- ``KVM_SMCCC_FILTER_DENY``: Rejects the guest SMCCC call in-kernel
+  and returns to the guest.
+
+- ``KVM_SMCCC_FILTER_FWD_TO_USER``: The guest SMCCC call is forwarded
+  to userspace with an exit reason of ``KVM_EXIT_HYPERCALL``.
+
+The ``pad`` field is reserved for future use and must be zero. KVM may
+return ``-EINVAL`` if the field is nonzero.
+
+KVM reserves the 'Arm Architecture Calls' range of function IDs and
+will reject attempts to define a filter for any portion of these ranges:
+
+    ===========  ===============
+    Start        End (inclusive)
+    ===========  ===============
+    0x8000_0000  0x8000_FFFF
+    0xC000_0000  0xC000_FFFF
+    ===========  ===============
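A minimal userspace sketch of installing one such range through the vm-fd Device Control API (assuming an arm64 VM fd on which no vCPU has run yet; the function ID range below is purely illustrative)::

    #include <linux/kvm.h>
    #include <sys/ioctl.h>

    static int fwd_range_to_user(int vm_fd, __u32 base, __u32 count)
    {
        struct kvm_smccc_filter filter = {
            .base         = base,   /* e.g. a vendor-specific range */
            .nr_functions = count,
            .action       = KVM_SMCCC_FILTER_FWD_TO_USER,
        };
        struct kvm_device_attr attr = {
            .group = KVM_ARM_VM_SMCCC_CTRL,
            .attr  = KVM_ARM_VM_SMCCC_FILTER,
            .addr  = (__u64)(unsigned long)&filter,
        };

        /* Fails with -EBUSY once a vCPU has already run */
        return ioctl(vm_fd, KVM_SET_DEVICE_ATTR, &attr);
    }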
@@ -21,7 +21,7 @@ The acquisition orders for mutexes are as follows:
 - kvm->mn_active_invalidate_count ensures that pairs of
   invalidate_range_start() and invalidate_range_end() callbacks
   use the same memslots array. kvm->slots_lock and kvm->slots_arch_lock
-  are taken on the waiting side in install_new_memslots, so MMU notifiers
+  are taken on the waiting side when modifying memslots, so MMU notifiers
   must not take either kvm->slots_lock or kvm->slots_arch_lock.
 
 For SRCU:
@@ -16,6 +16,7 @@
 #include <linux/types.h>
 #include <linux/jump_label.h>
 #include <linux/kvm_types.h>
+#include <linux/maple_tree.h>
 #include <linux/percpu.h>
 #include <linux/psci.h>
 #include <asm/arch_gicv3.h>
@@ -199,6 +200,9 @@ struct kvm_arch {
     /* Mandated version of PSCI */
     u32 psci_version;
 
+    /* Protects VM-scoped configuration data */
+    struct mutex config_lock;
+
     /*
      * If we encounter a data abort without valid instruction syndrome
      * information, report this to user space. User space can (and
@@ -221,7 +225,12 @@ struct kvm_arch {
 #define KVM_ARCH_FLAG_EL1_32BIT				4
     /* PSCI SYSTEM_SUSPEND enabled for the guest */
 #define KVM_ARCH_FLAG_SYSTEM_SUSPEND_ENABLED		5
-
+    /* VM counter offset */
+#define KVM_ARCH_FLAG_VM_COUNTER_OFFSET			6
+    /* Timer PPIs made immutable */
+#define KVM_ARCH_FLAG_TIMER_PPIS_IMMUTABLE		7
+    /* SMCCC filter initialized for the VM */
+#define KVM_ARCH_FLAG_SMCCC_FILTER_CONFIGURED		8
     unsigned long flags;
 
     /*
@@ -242,6 +251,7 @@ struct kvm_arch {
 
     /* Hypercall features firmware registers' descriptor */
     struct kvm_smccc_features smccc_feat;
+    struct maple_tree smccc_filter;
 
     /*
      * For an untrusted host VM, 'pkvm.handle' is used to lookup
@@ -365,6 +375,10 @@ enum vcpu_sysreg {
     TPIDR_EL2,      /* EL2 Software Thread ID Register */
     CNTHCTL_EL2,    /* Counter-timer Hypervisor Control register */
     SP_EL2,         /* EL2 Stack Pointer */
+    CNTHP_CTL_EL2,
+    CNTHP_CVAL_EL2,
+    CNTHV_CTL_EL2,
+    CNTHV_CVAL_EL2,
 
     NR_SYS_REGS     /* Nothing after this line! */
 };
@@ -522,6 +536,7 @@ struct kvm_vcpu_arch {
 
     /* vcpu power state */
     struct kvm_mp_state mp_state;
+    spinlock_t mp_state_lock;
 
     /* Cache some mmu pages needed inside spinlock regions */
     struct kvm_mmu_memory_cache mmu_page_cache;
@@ -939,6 +954,9 @@ void kvm_reset_sys_regs(struct kvm_vcpu *vcpu);
 
 int __init kvm_sys_reg_table_init(void);
 
+bool lock_all_vcpus(struct kvm *kvm);
+void unlock_all_vcpus(struct kvm *kvm);
+
 /* MMIO helpers */
 void kvm_mmio_write_buf(void *buf, unsigned int len, unsigned long data);
 unsigned long kvm_mmio_read_buf(const void *buf, unsigned int len);
@@ -1022,8 +1040,10 @@ int kvm_arm_vcpu_arch_get_attr(struct kvm_vcpu *vcpu,
 int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
                                struct kvm_device_attr *attr);
 
-long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
-                                struct kvm_arm_copy_mte_tags *copy_tags);
+int kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
+                               struct kvm_arm_copy_mte_tags *copy_tags);
+int kvm_vm_ioctl_set_counter_offset(struct kvm *kvm,
+                                    struct kvm_arm_counter_offset *offset);
 
 /* Guest/host FPSIMD coordination helpers */
 int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu);
@@ -1078,6 +1098,9 @@ bool kvm_arm_vcpu_is_finalized(struct kvm_vcpu *vcpu);
     (system_supports_32bit_el0() &&				\
      !static_branch_unlikely(&arm64_mismatched_32bit_el0))
 
+#define kvm_vm_has_ran_once(kvm)				\
+    (test_bit(KVM_ARCH_FLAG_HAS_RAN_ONCE, &(kvm)->arch.flags))
+
 int kvm_trng_call(struct kvm_vcpu *vcpu);
 #ifdef CONFIG_KVM
 extern phys_addr_t hyp_mem_base;
@@ -63,6 +63,7 @@
  * specific registers encoded in the instructions).
  */
 .macro kern_hyp_va reg
+#ifndef __KVM_VHE_HYPERVISOR__
 alternative_cb ARM64_ALWAYS_SYSTEM, kvm_update_va_mask
 	and	\reg, \reg, #1		/* mask with va_mask */
 	ror	\reg, \reg, #1		/* rotate to the first tag bit */
@@ -70,6 +71,7 @@ alternative_cb ARM64_ALWAYS_SYSTEM, kvm_update_va_mask
 	add	\reg, \reg, #0, lsl 12	/* insert the top 12 bits of the tag */
 	ror	\reg, \reg, #63		/* rotate back */
 alternative_cb_end
+#endif
 .endm
 
 /*
@@ -127,6 +129,7 @@ void kvm_apply_hyp_relocations(void);
 
 static __always_inline unsigned long __kern_hyp_va(unsigned long v)
 {
+#ifndef __KVM_VHE_HYPERVISOR__
     asm volatile(ALTERNATIVE_CB("and %0, %0, #1\n"
                                 "ror %0, %0, #1\n"
                                 "add %0, %0, #0\n"
@@ -135,6 +138,7 @@ static __always_inline unsigned long __kern_hyp_va(unsigned long v)
                                 ARM64_ALWAYS_SYSTEM,
                                 kvm_update_va_mask)
                  : "+r" (v));
+#endif
     return v;
 }
@@ -388,6 +388,7 @@
 
 #define SYS_CNTFRQ_EL0			sys_reg(3, 3, 14, 0, 0)
 
 #define SYS_CNTPCT_EL0			sys_reg(3, 3, 14, 0, 1)
+#define SYS_CNTPCTSS_EL0		sys_reg(3, 3, 14, 0, 5)
 #define SYS_CNTVCTSS_EL0		sys_reg(3, 3, 14, 0, 6)
 
@@ -400,7 +401,9 @@
 
 #define SYS_AARCH32_CNTP_TVAL		sys_reg(0, 0, 14, 2, 0)
 #define SYS_AARCH32_CNTP_CTL		sys_reg(0, 0, 14, 2, 1)
+#define SYS_AARCH32_CNTPCT		sys_reg(0, 0, 0, 14, 0)
 #define SYS_AARCH32_CNTP_CVAL		sys_reg(0, 2, 0, 14, 0)
+#define SYS_AARCH32_CNTPCTSS		sys_reg(0, 8, 0, 14, 0)
 
 #define __PMEV_op2(n)			((n) & 0x7)
 #define __CNTR_CRm(n)			(0x8 | (((n) >> 3) & 0x3))
@@ -198,6 +198,15 @@ struct kvm_arm_copy_mte_tags {
     __u64 reserved[2];
 };
 
+/*
+ * Counter/Timer offset structure. Describe the virtual/physical offset.
+ * To be used with KVM_ARM_SET_COUNTER_OFFSET.
+ */
+struct kvm_arm_counter_offset {
+    __u64 counter_offset;
+    __u64 reserved;
+};
+
 #define KVM_ARM_TAGS_TO_GUEST		0
 #define KVM_ARM_TAGS_FROM_GUEST		1
 
@@ -372,6 +381,10 @@ enum {
 #endif
 };
 
+/* Device Control API on vm fd */
+#define KVM_ARM_VM_SMCCC_CTRL		0
+#define   KVM_ARM_VM_SMCCC_FILTER	0
+
 /* Device Control API: ARM VGIC */
 #define KVM_DEV_ARM_VGIC_GRP_ADDR	0
 #define KVM_DEV_ARM_VGIC_GRP_DIST_REGS	1
@@ -411,6 +424,8 @@ enum {
 #define KVM_ARM_VCPU_TIMER_CTRL		1
 #define   KVM_ARM_VCPU_TIMER_IRQ_VTIMER		0
 #define   KVM_ARM_VCPU_TIMER_IRQ_PTIMER		1
+#define   KVM_ARM_VCPU_TIMER_IRQ_HVTIMER	2
+#define   KVM_ARM_VCPU_TIMER_IRQ_HPTIMER	3
 #define KVM_ARM_VCPU_PVTIME_CTRL	2
 #define   KVM_ARM_VCPU_PVTIME_IPA	0
 
@@ -469,6 +484,27 @@ enum {
 /* run->fail_entry.hardware_entry_failure_reason codes. */
 #define KVM_EXIT_FAIL_ENTRY_CPU_UNSUPPORTED	(1ULL << 0)
 
+enum kvm_smccc_filter_action {
+    KVM_SMCCC_FILTER_HANDLE = 0,
+    KVM_SMCCC_FILTER_DENY,
+    KVM_SMCCC_FILTER_FWD_TO_USER,
+
+#ifdef __KERNEL__
+    NR_SMCCC_FILTER_ACTIONS
+#endif
+};
+
+struct kvm_smccc_filter {
+    __u32 base;
+    __u32 nr_functions;
+    __u8 action;
+    __u8 pad[15];
+};
+
+/* arm64-specific KVM_EXIT_HYPERCALL flags */
+#define KVM_HYPERCALL_EXIT_SMC		(1U << 0)
+#define KVM_HYPERCALL_EXIT_16BIT	(1U << 1)
+
 #endif
 
 #endif /* __ARM_KVM_H__ */
@@ -2230,6 +2230,17 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
         .matches = has_cpuid_feature,
         ARM64_CPUID_FIELDS(ID_AA64MMFR0_EL1, ECV, IMP)
     },
+    {
+        .desc = "Enhanced Counter Virtualization (CNTPOFF)",
+        .capability = ARM64_HAS_ECV_CNTPOFF,
+        .type = ARM64_CPUCAP_SYSTEM_FEATURE,
+        .matches = has_cpuid_feature,
+        .sys_reg = SYS_ID_AA64MMFR0_EL1,
+        .field_pos = ID_AA64MMFR0_EL1_ECV_SHIFT,
+        .field_width = 4,
+        .sign = FTR_UNSIGNED,
+        .min_field_value = ID_AA64MMFR0_EL1_ECV_CNTPOFF,
+    },
 #ifdef CONFIG_ARM64_PAN
     {
         .desc = "Privileged Access Never",
File diff suppressed because it is too large
@@ -126,6 +126,16 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 {
     int ret;
 
+    mutex_init(&kvm->arch.config_lock);
+
+#ifdef CONFIG_LOCKDEP
+    /* Clue in lockdep that the config_lock must be taken inside kvm->lock */
+    mutex_lock(&kvm->lock);
+    mutex_lock(&kvm->arch.config_lock);
+    mutex_unlock(&kvm->arch.config_lock);
+    mutex_unlock(&kvm->lock);
+#endif
+
     ret = kvm_share_hyp(kvm, kvm + 1);
     if (ret)
         return ret;
@@ -146,6 +156,8 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 
     kvm_vgic_early_init(kvm);
 
+    kvm_timer_init_vm(kvm);
+
     /* The maximum number of VCPUs is limited by the host's GIC model */
     kvm->max_vcpus = kvm_arm_default_max_vcpus();
 
@@ -190,6 +202,8 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
     kvm_destroy_vcpus(kvm);
 
     kvm_unshare_hyp(kvm, kvm + 1);
+
+    kvm_arm_teardown_hypercalls(kvm);
 }
 
 int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
@@ -219,6 +233,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
     case KVM_CAP_PTP_KVM:
     case KVM_CAP_ARM_SYSTEM_SUSPEND:
     case KVM_CAP_IRQFD_RESAMPLE:
+    case KVM_CAP_COUNTER_OFFSET:
         r = 1;
         break;
     case KVM_CAP_SET_GUEST_DEBUG2:
@@ -325,6 +340,16 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
 {
     int err;
 
+    spin_lock_init(&vcpu->arch.mp_state_lock);
+
+#ifdef CONFIG_LOCKDEP
+    /* Inform lockdep that the config_lock is acquired after vcpu->mutex */
+    mutex_lock(&vcpu->mutex);
+    mutex_lock(&vcpu->kvm->arch.config_lock);
+    mutex_unlock(&vcpu->kvm->arch.config_lock);
+    mutex_unlock(&vcpu->mutex);
+#endif
+
     /* Force users to call KVM_ARM_VCPU_INIT */
     vcpu->arch.target = -1;
     bitmap_zero(vcpu->arch.features, KVM_VCPU_MAX_FEATURES);
@@ -442,34 +467,41 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
     vcpu->cpu = -1;
 }
 
-void kvm_arm_vcpu_power_off(struct kvm_vcpu *vcpu)
+static void __kvm_arm_vcpu_power_off(struct kvm_vcpu *vcpu)
 {
-    vcpu->arch.mp_state.mp_state = KVM_MP_STATE_STOPPED;
+    WRITE_ONCE(vcpu->arch.mp_state.mp_state, KVM_MP_STATE_STOPPED);
     kvm_make_request(KVM_REQ_SLEEP, vcpu);
     kvm_vcpu_kick(vcpu);
 }
 
+void kvm_arm_vcpu_power_off(struct kvm_vcpu *vcpu)
+{
+    spin_lock(&vcpu->arch.mp_state_lock);
+    __kvm_arm_vcpu_power_off(vcpu);
+    spin_unlock(&vcpu->arch.mp_state_lock);
+}
+
 bool kvm_arm_vcpu_stopped(struct kvm_vcpu *vcpu)
 {
-    return vcpu->arch.mp_state.mp_state == KVM_MP_STATE_STOPPED;
+    return READ_ONCE(vcpu->arch.mp_state.mp_state) == KVM_MP_STATE_STOPPED;
 }
 
 static void kvm_arm_vcpu_suspend(struct kvm_vcpu *vcpu)
 {
-    vcpu->arch.mp_state.mp_state = KVM_MP_STATE_SUSPENDED;
+    WRITE_ONCE(vcpu->arch.mp_state.mp_state, KVM_MP_STATE_SUSPENDED);
     kvm_make_request(KVM_REQ_SUSPEND, vcpu);
     kvm_vcpu_kick(vcpu);
 }
 
 static bool kvm_arm_vcpu_suspended(struct kvm_vcpu *vcpu)
 {
-    return vcpu->arch.mp_state.mp_state == KVM_MP_STATE_SUSPENDED;
+    return READ_ONCE(vcpu->arch.mp_state.mp_state) == KVM_MP_STATE_SUSPENDED;
 }
 
 int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu,
                                     struct kvm_mp_state *mp_state)
 {
-    *mp_state = vcpu->arch.mp_state;
+    *mp_state = READ_ONCE(vcpu->arch.mp_state);
 
     return 0;
 }
@@ -479,12 +511,14 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
 {
     int ret = 0;
 
+    spin_lock(&vcpu->arch.mp_state_lock);
+
     switch (mp_state->mp_state) {
     case KVM_MP_STATE_RUNNABLE:
-        vcpu->arch.mp_state = *mp_state;
+        WRITE_ONCE(vcpu->arch.mp_state, *mp_state);
         break;
     case KVM_MP_STATE_STOPPED:
-        kvm_arm_vcpu_power_off(vcpu);
+        __kvm_arm_vcpu_power_off(vcpu);
         break;
     case KVM_MP_STATE_SUSPENDED:
         kvm_arm_vcpu_suspend(vcpu);
@@ -493,6 +527,8 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
         ret = -EINVAL;
     }
 
+    spin_unlock(&vcpu->arch.mp_state_lock);
+
     return ret;
 }
@@ -592,9 +628,9 @@ int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
     if (kvm_vm_is_protected(kvm))
         kvm_call_hyp_nvhe(__pkvm_vcpu_init_traps, vcpu);
 
-    mutex_lock(&kvm->lock);
+    mutex_lock(&kvm->arch.config_lock);
     set_bit(KVM_ARCH_FLAG_HAS_RAN_ONCE, &kvm->arch.flags);
-    mutex_unlock(&kvm->lock);
+    mutex_unlock(&kvm->arch.config_lock);
 
     return ret;
 }
@@ -1209,10 +1245,14 @@ static int kvm_arch_vcpu_ioctl_vcpu_init(struct kvm_vcpu *vcpu,
     /*
     * Handle the "start in power-off" case.
     */
+    spin_lock(&vcpu->arch.mp_state_lock);
+
     if (test_bit(KVM_ARM_VCPU_POWER_OFF, vcpu->arch.features))
-        kvm_arm_vcpu_power_off(vcpu);
+        __kvm_arm_vcpu_power_off(vcpu);
     else
-        vcpu->arch.mp_state.mp_state = KVM_MP_STATE_RUNNABLE;
+        WRITE_ONCE(vcpu->arch.mp_state.mp_state, KVM_MP_STATE_RUNNABLE);
+
+    spin_unlock(&vcpu->arch.mp_state_lock);
 
     return 0;
 }
@@ -1438,11 +1478,31 @@ static int kvm_vm_ioctl_set_device_addr(struct kvm *kvm,
     }
 }
 
-long kvm_arch_vm_ioctl(struct file *filp,
-                       unsigned int ioctl, unsigned long arg)
+static int kvm_vm_has_attr(struct kvm *kvm, struct kvm_device_attr *attr)
+{
+    switch (attr->group) {
+    case KVM_ARM_VM_SMCCC_CTRL:
+        return kvm_vm_smccc_has_attr(kvm, attr);
+    default:
+        return -ENXIO;
+    }
+}
+
+static int kvm_vm_set_attr(struct kvm *kvm, struct kvm_device_attr *attr)
+{
+    switch (attr->group) {
+    case KVM_ARM_VM_SMCCC_CTRL:
+        return kvm_vm_smccc_set_attr(kvm, attr);
+    default:
+        return -ENXIO;
+    }
+}
+
+int kvm_arch_vm_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
 {
     struct kvm *kvm = filp->private_data;
     void __user *argp = (void __user *)arg;
+    struct kvm_device_attr attr;
 
     switch (ioctl) {
     case KVM_CREATE_IRQCHIP: {
@@ -1478,11 +1538,73 @@ long kvm_arch_vm_ioctl(struct file *filp,
             return -EFAULT;
         return kvm_vm_ioctl_mte_copy_tags(kvm, &copy_tags);
     }
+    case KVM_ARM_SET_COUNTER_OFFSET: {
+        struct kvm_arm_counter_offset offset;
+
+        if (copy_from_user(&offset, argp, sizeof(offset)))
+            return -EFAULT;
+        return kvm_vm_ioctl_set_counter_offset(kvm, &offset);
+    }
+    case KVM_HAS_DEVICE_ATTR: {
+        if (copy_from_user(&attr, argp, sizeof(attr)))
+            return -EFAULT;
+
+        return kvm_vm_has_attr(kvm, &attr);
+    }
+    case KVM_SET_DEVICE_ATTR: {
+        if (copy_from_user(&attr, argp, sizeof(attr)))
+            return -EFAULT;
+
+        return kvm_vm_set_attr(kvm, &attr);
+    }
     default:
         return -EINVAL;
     }
 }
 
+/* unlocks vcpus from @vcpu_lock_idx and smaller */
+static void unlock_vcpus(struct kvm *kvm, int vcpu_lock_idx)
+{
+    struct kvm_vcpu *tmp_vcpu;
+
+    for (; vcpu_lock_idx >= 0; vcpu_lock_idx--) {
+        tmp_vcpu = kvm_get_vcpu(kvm, vcpu_lock_idx);
+        mutex_unlock(&tmp_vcpu->mutex);
+    }
+}
+
+void unlock_all_vcpus(struct kvm *kvm)
+{
+    lockdep_assert_held(&kvm->lock);
+
+    unlock_vcpus(kvm, atomic_read(&kvm->online_vcpus) - 1);
+}
+
+/* Returns true if all vcpus were locked, false otherwise */
+bool lock_all_vcpus(struct kvm *kvm)
+{
+    struct kvm_vcpu *tmp_vcpu;
+    unsigned long c;
+
+    lockdep_assert_held(&kvm->lock);
+
+    /*
+     * Any time a vcpu is in an ioctl (including running), the
+     * core KVM code tries to grab the vcpu->mutex.
+     *
+     * By grabbing the vcpu->mutex of all VCPUs we ensure that no
+     * other VCPUs can fiddle with the state while we access it.
+     */
+    kvm_for_each_vcpu(c, tmp_vcpu, kvm) {
+        if (!mutex_trylock(&tmp_vcpu->mutex)) {
+            unlock_vcpus(kvm, c - 1);
+            return false;
+        }
+    }
+
+    return true;
+}
+
 static unsigned long nvhe_percpu_size(void)
 {
     return (unsigned long)CHOOSE_NVHE_SYM(__per_cpu_end) -
@@ -590,11 +590,16 @@ static unsigned long num_core_regs(const struct kvm_vcpu *vcpu)
     return copy_core_reg_indices(vcpu, NULL);
 }
 
 /**
  * ARM64 versions of the TIMER registers, always available on arm64
  */
+static const u64 timer_reg_list[] = {
+    KVM_REG_ARM_TIMER_CTL,
+    KVM_REG_ARM_TIMER_CNT,
+    KVM_REG_ARM_TIMER_CVAL,
+    KVM_REG_ARM_PTIMER_CTL,
+    KVM_REG_ARM_PTIMER_CNT,
+    KVM_REG_ARM_PTIMER_CVAL,
+};
 
-#define NUM_TIMER_REGS 3
+#define NUM_TIMER_REGS ARRAY_SIZE(timer_reg_list)
 
 static bool is_timer_reg(u64 index)
 {
@@ -602,6 +607,9 @@ static bool is_timer_reg(u64 index)
     case KVM_REG_ARM_TIMER_CTL:
     case KVM_REG_ARM_TIMER_CNT:
     case KVM_REG_ARM_TIMER_CVAL:
+    case KVM_REG_ARM_PTIMER_CTL:
+    case KVM_REG_ARM_PTIMER_CNT:
+    case KVM_REG_ARM_PTIMER_CVAL:
         return true;
     }
     return false;
@@ -609,14 +617,11 @@ static bool is_timer_reg(u64 index)
 
 static int copy_timer_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
 {
-    if (put_user(KVM_REG_ARM_TIMER_CTL, uindices))
-        return -EFAULT;
-    uindices++;
-    if (put_user(KVM_REG_ARM_TIMER_CNT, uindices))
-        return -EFAULT;
-    uindices++;
-    if (put_user(KVM_REG_ARM_TIMER_CVAL, uindices))
-        return -EFAULT;
+    for (int i = 0; i < NUM_TIMER_REGS; i++) {
+        if (put_user(timer_reg_list[i], uindices))
+            return -EFAULT;
+        uindices++;
+    }
 
     return 0;
 }
@@ -957,7 +962,9 @@ int kvm_arm_vcpu_arch_set_attr(struct kvm_vcpu *vcpu,
 
     switch (attr->group) {
     case KVM_ARM_VCPU_PMU_V3_CTRL:
+        mutex_lock(&vcpu->kvm->arch.config_lock);
         ret = kvm_arm_pmu_v3_set_attr(vcpu, attr);
+        mutex_unlock(&vcpu->kvm->arch.config_lock);
         break;
     case KVM_ARM_VCPU_TIMER_CTRL:
         ret = kvm_arm_timer_set_attr(vcpu, attr);
@@ -1019,8 +1026,8 @@ int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
     return ret;
 }
 
-long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
-                                struct kvm_arm_copy_mte_tags *copy_tags)
+int kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
+                               struct kvm_arm_copy_mte_tags *copy_tags)
 {
     gpa_t guest_ipa = copy_tags->guest_ipa;
     size_t length = copy_tags->length;
@@ -1041,6 +1048,10 @@ long kvm_vm_ioctl_mte_copy_tags(struct kvm *kvm,
     if (length & ~PAGE_MASK || guest_ipa & ~PAGE_MASK)
         return -EINVAL;
 
+    /* Lengths above INT_MAX cannot be represented in the return value */
+    if (length > INT_MAX)
+        return -EINVAL;
+
     gfn = gpa_to_gfn(guest_ipa);
 
     mutex_lock(&kvm->slots_lock);
@@ -36,8 +36,6 @@ static void kvm_handle_guest_serror(struct kvm_vcpu *vcpu, u64 esr)
 
 static int handle_hvc(struct kvm_vcpu *vcpu)
 {
-    int ret;
-
     trace_kvm_hvc_arm64(*vcpu_pc(vcpu), vcpu_get_reg(vcpu, 0),
                         kvm_vcpu_hvc_get_imm(vcpu));
     vcpu->stat.hvc_exit_stat++;
@@ -52,33 +50,29 @@ static int handle_hvc(struct kvm_vcpu *vcpu)
         return 1;
     }
 
-    ret = kvm_hvc_call_handler(vcpu);
-    if (ret < 0) {
-        vcpu_set_reg(vcpu, 0, ~0UL);
-        return 1;
-    }
-
-    return ret;
+    return kvm_smccc_call_handler(vcpu);
 }
 
 static int handle_smc(struct kvm_vcpu *vcpu)
 {
-    int ret;
-
     /*
      * "If an SMC instruction executed at Non-secure EL1 is
      * trapped to EL2 because HCR_EL2.TSC is 1, the exception is a
      * Trap exception, not a Secure Monitor Call exception [...]"
      *
      * We need to advance the PC after the trap, as it would
-     * otherwise return to the same address...
-     *
-     * Only handle SMCs from the virtual EL2 with an immediate of zero and
-     * skip it otherwise.
+     * otherwise return to the same address. Furthermore, pre-incrementing
+     * the PC before potentially exiting to userspace maintains the same
+     * abstraction for both SMCs and HVCs.
      */
-    if (!vcpu_is_el2(vcpu) || kvm_vcpu_hvc_get_imm(vcpu)) {
+    kvm_incr_pc(vcpu);
+
+    /*
+     * SMCs with a nonzero immediate are reserved according to DEN0028E 2.9
+     * "SMC and HVC immediate value".
+     */
+    if (kvm_vcpu_hvc_get_imm(vcpu)) {
         vcpu_set_reg(vcpu, 0, ~0UL);
-        kvm_incr_pc(vcpu);
         return 1;
     }
 
@@ -89,13 +83,7 @@ static int handle_smc(struct kvm_vcpu *vcpu)
     * at Non-secure EL1 is trapped to EL2 if HCR_EL2.TSC==1, rather than
     * being treated as UNDEFINED.
     */
-    ret = kvm_hvc_call_handler(vcpu);
-    if (ret < 0)
-        vcpu_set_reg(vcpu, 0, ~0UL);
-
-    kvm_incr_pc(vcpu);
-
-    return ret;
+    return kvm_smccc_call_handler(vcpu);
 }
 
 /*
@@ -26,6 +26,7 @@
 #include <asm/kvm_emulate.h>
 #include <asm/kvm_hyp.h>
 #include <asm/kvm_mmu.h>
+#include <asm/kvm_nested.h>
 #include <asm/fpsimd.h>
 #include <asm/debug-monitors.h>
 #include <asm/processor.h>
@@ -326,6 +327,55 @@ static bool kvm_hyp_handle_ptrauth(struct kvm_vcpu *vcpu, u64 *exit_code)
     return true;
 }
 
+static bool kvm_hyp_handle_cntpct(struct kvm_vcpu *vcpu)
+{
+    struct arch_timer_context *ctxt;
+    u32 sysreg;
+    u64 val;
+
+    /*
+     * We only get here for 64bit guests, 32bit guests will hit
+     * the long and winding road all the way to the standard
+     * handling. Yes, it sucks to be irrelevant.
+     */
+    sysreg = esr_sys64_to_sysreg(kvm_vcpu_get_esr(vcpu));
+
+    switch (sysreg) {
+    case SYS_CNTPCT_EL0:
+    case SYS_CNTPCTSS_EL0:
+        if (vcpu_has_nv(vcpu)) {
+            if (is_hyp_ctxt(vcpu)) {
+                ctxt = vcpu_hptimer(vcpu);
+                break;
+            }
+
+            /* Check for guest hypervisor trapping */
+            val = __vcpu_sys_reg(vcpu, CNTHCTL_EL2);
+            if (!vcpu_el2_e2h_is_set(vcpu))
+                val = (val & CNTHCTL_EL1PCTEN) << 10;
+
+            if (!(val & (CNTHCTL_EL1PCTEN << 10)))
+                return false;
+        }
+
+        ctxt = vcpu_ptimer(vcpu);
+        break;
+    default:
+        return false;
+    }
+
+    val = arch_timer_read_cntpct_el0();
+
+    if (ctxt->offset.vm_offset)
+        val -= *kern_hyp_va(ctxt->offset.vm_offset);
+    if (ctxt->offset.vcpu_offset)
+        val -= *kern_hyp_va(ctxt->offset.vcpu_offset);
+
+    vcpu_set_reg(vcpu, kvm_vcpu_sys_get_rt(vcpu), val);
+    __kvm_skip_instr(vcpu);
+    return true;
+}
+
 static bool kvm_hyp_handle_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code)
 {
     if (cpus_have_final_cap(ARM64_WORKAROUND_CAVIUM_TX2_219_TVM) &&
@@ -339,6 +389,9 @@ static bool kvm_hyp_handle_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code)
     if (esr_is_ptrauth_trap(kvm_vcpu_get_esr(vcpu)))
         return kvm_hyp_handle_ptrauth(vcpu, exit_code);
 
+    if (kvm_hyp_handle_cntpct(vcpu))
+        return true;
+
     return false;
 }
@@ -37,7 +37,6 @@ static void __debug_save_spe(u64 *pmscr_el1)
 
     /* Now drain all buffered data to memory */
     psb_csync();
-    dsb(nsh);
 }
 
 static void __debug_restore_spe(u64 pmscr_el1)
@@ -69,7 +68,6 @@ static void __debug_save_trace(u64 *trfcr_el1)
     isb();
     /* Drain the trace buffer to memory */
    tsb_csync();
-    dsb(nsh);
 }
 
 static void __debug_restore_trace(u64 trfcr_el1)
@@ -297,6 +297,13 @@ int __pkvm_prot_finalize(void)
     params->vttbr = kvm_get_vttbr(mmu);
     params->vtcr = host_mmu.arch.vtcr;
     params->hcr_el2 |= HCR_VM;
+
+    /*
+     * The CMO below not only cleans the updated params to the
+     * PoC, but also provides the DSB that ensures ongoing
+     * page-table walks that have started before we trapped to EL2
+     * have completed.
+     */
     kvm_flush_dcache_to_poc(params, sizeof(*params));
 
     write_sysreg(params->hcr_el2, hcr_el2);
@@ -272,6 +272,17 @@ int __kvm_vcpu_run(struct kvm_vcpu *vcpu)
     */
    __debug_save_host_buffers_nvhe(vcpu);
 
+    /*
+     * We're about to restore some new MMU state. Make sure
+     * ongoing page-table walks that have started before we
+     * trapped to EL2 have completed. This also synchronises the
+     * above disabling of SPE and TRBE.
+     *
+     * See DDI0487I.a D8.1.5 "Out-of-context translation regimes",
+     * rule R_LFHQG and subsequent information statements.
+     */
+    dsb(nsh);
+
    __kvm_adjust_pc(vcpu);
 
    /*
@@ -306,6 +317,13 @@ int __kvm_vcpu_run(struct kvm_vcpu *vcpu)
    __timer_disable_traps(vcpu);
    __hyp_vgic_save_state(vcpu);
 
+    /*
+     * Same thing as before the guest run: we're about to switch
+     * the MMU context, so let's make sure we don't have any
+     * ongoing EL1&0 translations.
+     */
+    dsb(nsh);
+
    __deactivate_traps(vcpu);
    __load_host_stage2();
@@ -9,6 +9,7 @@
 #include <linux/kvm_host.h>
 
 #include <asm/kvm_hyp.h>
+#include <asm/kvm_mmu.h>
 
 void __kvm_timer_set_cntvoff(u64 cntvoff)
 {
@@ -35,14 +36,19 @@ void __timer_disable_traps(struct kvm_vcpu *vcpu)
  */
 void __timer_enable_traps(struct kvm_vcpu *vcpu)
 {
-    u64 val;
+    u64 clr = 0, set = 0;
 
     /*
      * Disallow physical timer access for the guest
-     * Physical counter access is allowed
+     * Physical counter access is allowed if no offset is enforced
+     * or running protected (we don't offset anything in this case).
      */
-    val = read_sysreg(cnthctl_el2);
-    val &= ~CNTHCTL_EL1PCEN;
-    val |= CNTHCTL_EL1PCTEN;
-    write_sysreg(val, cnthctl_el2);
+    clr = CNTHCTL_EL1PCEN;
+    if (is_protected_kvm_enabled() ||
+        !kern_hyp_va(vcpu->kvm)->arch.timer_data.poffset)
+        set |= CNTHCTL_EL1PCTEN;
+    else
+        clr |= CNTHCTL_EL1PCTEN;
+
+    sysreg_clear_set(cnthctl_el2, clr, set);
 }
@@ -15,8 +15,31 @@ struct tlb_inv_context {
 };
 
 static void __tlb_switch_to_guest(struct kvm_s2_mmu *mmu,
-                                  struct tlb_inv_context *cxt)
+                                  struct tlb_inv_context *cxt,
+                                  bool nsh)
 {
+    /*
+     * We have two requirements:
+     *
+     * - ensure that the page table updates are visible to all
+     *   CPUs, for which a dsb(DOMAIN-st) is what we need, DOMAIN
+     *   being either ish or nsh, depending on the invalidation
+     *   type.
+     *
+     * - complete any speculative page table walk started before
+     *   we trapped to EL2 so that we can mess with the MM
+     *   registers out of context, for which dsb(nsh) is enough
+     *
+     * The composition of these two barriers is a dsb(DOMAIN), and
+     * the 'nsh' parameter tracks the distinction between
+     * Inner-Shareable and Non-Shareable, as specified by the
+     * callers.
+     */
+    if (nsh)
+        dsb(nsh);
+    else
+        dsb(ish);
+
     if (cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {
         u64 val;
 
@@ -60,10 +83,8 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu,
 {
     struct tlb_inv_context cxt;
 
-    dsb(ishst);
-
     /* Switch to requested VMID */
-    __tlb_switch_to_guest(mmu, &cxt);
+    __tlb_switch_to_guest(mmu, &cxt, false);
 
     /*
      * We could do so much better if we had the VA as well.
@@ -113,10 +134,8 @@ void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu)
 {
     struct tlb_inv_context cxt;
 
-    dsb(ishst);
-
     /* Switch to requested VMID */
-    __tlb_switch_to_guest(mmu, &cxt);
+    __tlb_switch_to_guest(mmu, &cxt, false);
 
     __tlbi(vmalls12e1is);
     dsb(ish);
@@ -130,7 +149,7 @@ void __kvm_flush_cpu_context(struct kvm_s2_mmu *mmu)
     struct tlb_inv_context cxt;
 
     /* Switch to requested VMID */
-    __tlb_switch_to_guest(mmu, &cxt);
+    __tlb_switch_to_guest(mmu, &cxt, false);
 
     __tlbi(vmalle1);
     asm volatile("ic iallu");
@@ -142,7 +161,8 @@ void __kvm_flush_cpu_context(struct kvm_s2_mmu *mmu)
 
 void __kvm_flush_vm_context(void)
 {
-    dsb(ishst);
+    /* Same remark as in __tlb_switch_to_guest() */
+    dsb(ish);
     __tlbi(alle1is);
 
     /*
@@ -227,11 +227,10 @@ int __kvm_vcpu_run(struct kvm_vcpu *vcpu)
 
     /*
      * When we exit from the guest we change a number of CPU configuration
-     * parameters, such as traps. Make sure these changes take effect
-     * before running the host or additional guests.
+     * parameters, such as traps. We rely on the isb() in kvm_call_hyp*()
+     * to make sure these changes take effect before running the host or
+     * additional guests.
     */
-    isb();
-
     return ret;
 }
@@ -13,6 +13,7 @@
 #include <asm/kvm_asm.h>
 #include <asm/kvm_emulate.h>
 #include <asm/kvm_hyp.h>
+#include <asm/kvm_nested.h>
 
 /*
  * VHE: Host and guest must save mdscr_el1 and sp_el0 (and the PC and
@@ -69,6 +70,17 @@ void kvm_vcpu_load_sysregs_vhe(struct kvm_vcpu *vcpu)
     host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
     __sysreg_save_user_state(host_ctxt);
 
+    /*
+     * When running a normal EL1 guest, we only load a new vcpu
+     * after a context switch, which involves a DSB, so all
+     * speculative EL1&0 walks will have already completed.
+     * If running NV, the vcpu may transition between vEL1 and
+     * vEL2 without a context switch, so make sure we complete
+     * those walks before loading a new context.
+     */
+    if (vcpu_has_nv(vcpu))
+        dsb(nsh);
+
     /*
      * Load guest EL1 and user state
      *
Some files were not shown because too many files have changed in this diff