Merge tag 'kvm-3.10-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Gleb Natapov:
 "Highlights of the updates are:

  general:
   - new emulated device API
   - legacy device assignment is now optional
   - irqfd interface is more generic and can be shared between arches

  x86:
   - VMCS shadow support and other nested VMX improvements
   - APIC virtualization and Posted Interrupt hardware support
   - Optimize mmio spte zapping

  ppc:
    - BookE: in-kernel MPIC emulation with irqfd support
    - Book3S: in-kernel XICS emulation (incomplete)
    - Book3S: HV: migration fixes
    - BookE: more debug support preparation
    - BookE: e6500 support

  ARM:
   - reworking of Hyp idmaps

  s390:
   - ioeventfd for virtio-ccw

  And many other bug fixes, cleanups and improvements"

* tag 'kvm-3.10-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (204 commits)
  kvm: Add compat_ioctl for device control API
  KVM: x86: Account for failing enable_irq_window for NMI window request
  KVM: PPC: Book3S: Add API for in-kernel XICS emulation
  kvm/ppc/mpic: fix missing unlock in set_base_addr()
  kvm/ppc: Hold srcu lock when calling kvm_io_bus_read/write
  kvm/ppc/mpic: remove users
  kvm/ppc/mpic: fix mmio region lists when multiple guests used
  kvm/ppc/mpic: remove default routes from documentation
  kvm: KVM_CAP_IOMMU only available with device assignment
  ARM: KVM: iterate over all CPUs for CPU compatibility check
  KVM: ARM: Fix spelling in error message
  ARM: KVM: define KVM_ARM_MAX_VCPUS unconditionally
  KVM: ARM: Fix API documentation for ONE_REG encoding
  ARM: KVM: promote vfp_host pointer to generic host cpu context
  ARM: KVM: add architecture specific hook for capabilities
  ARM: KVM: perform HYP initilization for hotplugged CPUs
  ARM: KVM: switch to a dual-step HYP init code
  ARM: KVM: rework HYP page table freeing
  ARM: KVM: enforce maximum size for identity mapped code
  ARM: KVM: move to a KVM provided HYP idmap
  ...
This commit is contained in:
Linus Torvalds
2013-05-05 14:47:31 -07:00
110 changed files with 8607 additions and 2355 deletions
+140 -6
@@ -1486,15 +1486,23 @@ struct kvm_ioeventfd {
__u8 pad[36];
};
For the special case of virtio-ccw devices on s390, the ioevent is matched
to a subchannel/virtqueue tuple instead.
The following flags are defined:
#define KVM_IOEVENTFD_FLAG_DATAMATCH (1 << kvm_ioeventfd_flag_nr_datamatch)
#define KVM_IOEVENTFD_FLAG_PIO (1 << kvm_ioeventfd_flag_nr_pio)
#define KVM_IOEVENTFD_FLAG_DEASSIGN (1 << kvm_ioeventfd_flag_nr_deassign)
#define KVM_IOEVENTFD_FLAG_VIRTIO_CCW_NOTIFY \
(1 << kvm_ioeventfd_flag_nr_virtio_ccw_notify)
If datamatch flag is set, the event will be signaled only if the written value
to the registered address is equal to datamatch in struct kvm_ioeventfd.
For virtio-ccw devices, addr contains the subchannel id and datamatch the
virtqueue index.
4.60 KVM_DIRTY_TLB
@@ -1780,27 +1788,48 @@ registers, find a list below:
PPC | KVM_REG_PPC_VPA_DTL | 128
PPC | KVM_REG_PPC_EPCR | 32
PPC | KVM_REG_PPC_EPR | 32
PPC | KVM_REG_PPC_TCR | 32
PPC | KVM_REG_PPC_TSR | 32
PPC | KVM_REG_PPC_OR_TSR | 32
PPC | KVM_REG_PPC_CLEAR_TSR | 32
PPC | KVM_REG_PPC_MAS0 | 32
PPC | KVM_REG_PPC_MAS1 | 32
PPC | KVM_REG_PPC_MAS2 | 64
PPC | KVM_REG_PPC_MAS7_3 | 64
PPC | KVM_REG_PPC_MAS4 | 32
PPC | KVM_REG_PPC_MAS6 | 32
PPC | KVM_REG_PPC_MMUCFG | 32
PPC | KVM_REG_PPC_TLB0CFG | 32
PPC | KVM_REG_PPC_TLB1CFG | 32
PPC | KVM_REG_PPC_TLB2CFG | 32
PPC | KVM_REG_PPC_TLB3CFG | 32
PPC | KVM_REG_PPC_TLB0PS | 32
PPC | KVM_REG_PPC_TLB1PS | 32
PPC | KVM_REG_PPC_TLB2PS | 32
PPC | KVM_REG_PPC_TLB3PS | 32
PPC | KVM_REG_PPC_EPTCFG | 32
PPC | KVM_REG_PPC_ICP_STATE | 64
ARM registers are mapped using the lower 32 bits. The upper 16 of that
is the register group type, or coprocessor number:
ARM core registers have the following id bit patterns:
0x4002 0000 0010 <index into the kvm_regs struct:16>
0x4020 0000 0010 <index into the kvm_regs struct:16>
ARM 32-bit CP15 registers have the following id bit patterns:
0x4002 0000 000F <zero:1> <crn:4> <crm:4> <opc1:4> <opc2:3>
0x4020 0000 000F <zero:1> <crn:4> <crm:4> <opc1:4> <opc2:3>
ARM 64-bit CP15 registers have the following id bit patterns:
0x4003 0000 000F <zero:1> <zero:4> <crm:4> <opc1:4> <zero:3>
0x4030 0000 000F <zero:1> <zero:4> <crm:4> <opc1:4> <zero:3>
ARM CCSIDR registers are demultiplexed by CSSELR value:
0x4002 0000 0011 00 <csselr:8>
0x4020 0000 0011 00 <csselr:8>
ARM 32-bit VFP control registers have the following id bit patterns:
0x4002 0000 0012 1 <regno:12>
0x4020 0000 0012 1 <regno:12>
ARM 64-bit FP registers have the following id bit patterns:
0x4002 0000 0012 0 <regno:12>
0x4030 0000 0012 0 <regno:12>
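As a rough illustration of how the 32-bit CP15 pattern decomposes, the sketch below assembles an id from its fields. It assumes the 0x4020 prefix (the variant carrying the 32-bit size encoding) and the field widths listed above; the macro and helper names are hypothetical and are not part of the KVM headers.

```c
#include <stdint.h>

/*
 * Hypothetical constant: the fixed part of the pattern
 * "0x4020 0000 000F <zero:1> <crn:4> <crm:4> <opc1:4> <opc2:3>",
 * i.e. the upper 48 bits, with coprocessor number 15 (0x000F) sitting in
 * the upper 16 bits of the lower 32-bit word.
 */
#define ARM_CP15_REG32_PREFIX 0x40200000000F0000ULL

/* Hypothetical helper: pack crn/crm/opc1/opc2 into the low 16 bits. */
static uint64_t arm_cp15_reg32_id(unsigned int crn, unsigned int crm,
                                  unsigned int opc1, unsigned int opc2)
{
    return ARM_CP15_REG32_PREFIX |
           ((uint64_t)(crn  & 0xf) << 11) |  /* bits 11-14 */
           ((uint64_t)(crm  & 0xf) << 7)  |  /* bits 7-10  */
           ((uint64_t)(opc1 & 0xf) << 3)  |  /* bits 3-6   */
            (uint64_t)(opc2 & 0x7);          /* bits 0-2; bit 15 stays zero */
}
```

The resulting value would be placed in the id field of struct kvm_one_reg.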
4.69 KVM_GET_ONE_REG
@@ -2161,6 +2190,76 @@ header; first `n_valid' valid entries with contents from the data
written, then `n_invalid' invalid entries, invalidating any previously
valid entries found.
4.79 KVM_CREATE_DEVICE
Capability: KVM_CAP_DEVICE_CTRL
Type: vm ioctl
Parameters: struct kvm_create_device (in/out)
Returns: 0 on success, -1 on error
Errors:
ENODEV: The device type is unknown or unsupported
EEXIST: Device already created, and this type of device may not
be instantiated multiple times
Other error conditions may be defined by individual device types or
have their standard meanings.
Creates an emulated device in the kernel. The file descriptor returned
in fd can be used with KVM_SET/GET/HAS_DEVICE_ATTR.
If the KVM_CREATE_DEVICE_TEST flag is set, only test whether the
device type is supported (not necessarily whether it can be created
in the current vm).
Individual devices should not define flags. Attributes should be used
for specifying any behavior that is not implied by the device type
number.
struct kvm_create_device {
__u32 type; /* in: KVM_DEV_TYPE_xxx */
__u32 fd; /* out: device handle */
__u32 flags; /* in: KVM_CREATE_DEVICE_xxx */
};
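A minimal userspace sketch of the two ways to call this ioctl, assuming a vm file descriptor obtained elsewhere and kernel headers that already define KVM_CREATE_DEVICE. The helper names are hypothetical.

```c
#include <linux/kvm.h>
#include <sys/ioctl.h>

/*
 * Hypothetical helper: returns 1 if the kernel reports the device type as
 * supported, 0 otherwise. With KVM_CREATE_DEVICE_TEST set, no device is
 * created and no fd is returned.
 */
static int kvm_device_type_supported(int vm_fd, unsigned int type)
{
    struct kvm_create_device cd = {
        .type  = type,                    /* KVM_DEV_TYPE_xxx */
        .flags = KVM_CREATE_DEVICE_TEST,
    };

    return ioctl(vm_fd, KVM_CREATE_DEVICE, &cd) == 0;
}

/*
 * Hypothetical helper: creates the device and returns its fd, or -1 on
 * error (errno is left as set by ioctl, e.g. ENODEV or EEXIST).
 */
static int kvm_create_device_fd(int vm_fd, unsigned int type)
{
    struct kvm_create_device cd = { .type = type };

    if (ioctl(vm_fd, KVM_CREATE_DEVICE, &cd) < 0)
        return -1;

    return (int)cd.fd;                    /* out: device handle */
}
```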
4.80 KVM_SET_DEVICE_ATTR/KVM_GET_DEVICE_ATTR
Capability: KVM_CAP_DEVICE_CTRL
Type: device ioctl
Parameters: struct kvm_device_attr
Returns: 0 on success, -1 on error
Errors:
ENXIO: The group or attribute is unknown/unsupported for this device
EPERM: The attribute cannot (currently) be accessed this way
(e.g. read-only attribute, or attribute that only makes
sense when the device is in a different state)
Other error conditions may be defined by individual device types.
Gets/sets a specified piece of device configuration and/or state. The
semantics are device-specific. See individual device documentation in
the "devices" directory. As with ONE_REG, the size of the data
transferred is defined by the particular attribute.
struct kvm_device_attr {
__u32 flags; /* no flags currently defined */
__u32 group; /* device-defined */
__u64 attr; /* group-defined */
__u64 addr; /* userspace address of attr data */
};
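The following sketch shows one way to wrap the get/set calls for an attribute whose data happens to be 64 bits wide. The transfer size is defined by the attribute itself, so the 64-bit assumption and the helper names are illustrative only.

```c
#include <stdint.h>
#include <linux/kvm.h>
#include <sys/ioctl.h>

/* Hypothetical helper: write a 64-bit attribute value to a device fd. */
static int kvm_device_set_u64(int dev_fd, uint32_t group, uint64_t attr,
                              uint64_t value)
{
    struct kvm_device_attr da = {
        .flags = 0,                            /* no flags currently defined */
        .group = group,                        /* device-defined */
        .attr  = attr,                         /* group-defined */
        .addr  = (uint64_t)(uintptr_t)&value,  /* userspace address of data */
    };

    return ioctl(dev_fd, KVM_SET_DEVICE_ATTR, &da);
}

/* Hypothetical helper: read a 64-bit attribute value from a device fd. */
static int kvm_device_get_u64(int dev_fd, uint32_t group, uint64_t attr,
                              uint64_t *value)
{
    struct kvm_device_attr da = {
        .flags = 0,
        .group = group,
        .attr  = attr,
        .addr  = (uint64_t)(uintptr_t)value,
    };

    return ioctl(dev_fd, KVM_GET_DEVICE_ATTR, &da);
}
```

Both helpers return 0 on success and -1 on error, mirroring the ioctl itself.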
4.81 KVM_HAS_DEVICE_ATTR
Capability: KVM_CAP_DEVICE_CTRL
Type: device ioctl
Parameters: struct kvm_device_attr
Returns: 0 on success, -1 on error
Errors:
ENXIO: The group or attribute is unknown/unsupported for this device
Tests whether a device supports a particular attribute. A successful
return indicates the attribute is implemented. It does not necessarily
indicate that the attribute can be read or written in the device's
current state. "addr" is ignored.
4.77 KVM_ARM_VCPU_INIT
@@ -2243,6 +2342,25 @@ and distributor interface, the ioctl must be called after calling
KVM_CREATE_IRQCHIP, but before calling KVM_RUN on any of the VCPUs. Calling
this ioctl twice for any of the base addresses will return -EEXIST.
4.82 KVM_PPC_RTAS_DEFINE_TOKEN
Capability: KVM_CAP_PPC_RTAS
Architectures: ppc
Type: vm ioctl
Parameters: struct kvm_rtas_token_args
Returns: 0 on success, -1 on error
Defines a token value for a RTAS (Run Time Abstraction Services)
service in order to allow it to be handled in the kernel. The
argument struct gives the name of the service, which must be the name
of a service that has a kernel-side implementation. If the token
value is non-zero, it will be associated with that service, and
subsequent RTAS calls by the guest specifying that token will be
handled by the kernel. If the token value is 0, then any token
associated with the service will be forgotten, and subsequent RTAS
calls by the guest for that service will be passed to userspace to be
handled.
5. The kvm_run structure
------------------------
@@ -2646,3 +2764,19 @@ to receive the topmost interrupt vector.
When disabled (args[0] == 0), behavior is as if this facility is unsupported.
When this capability is enabled, KVM_EXIT_EPR can occur.
6.6 KVM_CAP_IRQ_MPIC
Architectures: ppc
Parameters: args[0] is the MPIC device fd
args[1] is the MPIC CPU number for this vcpu
This capability connects the vcpu to an in-kernel MPIC device.
6.7 KVM_CAP_IRQ_XICS
Architectures: ppc
Parameters: args[0] is the XICS device fd
args[1] is the XICS CPU number (server ID) for this vcpu
This capability connects the vcpu to an in-kernel XICS device.
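A sketch of how a vcpu might be connected to an in-kernel interrupt controller with either capability, assuming the capability is enabled through KVM_ENABLE_CAP on the vcpu fd and that the device fd came from KVM_CREATE_DEVICE. The helper name is hypothetical.

```c
#include <string.h>
#include <linux/kvm.h>
#include <sys/ioctl.h>

/*
 * Hypothetical helper: cap is KVM_CAP_IRQ_MPIC or KVM_CAP_IRQ_XICS,
 * dev_fd is the interrupt controller device fd, cpu_nr is the CPU number
 * (server ID for XICS) this vcpu should have on that controller.
 */
static int vcpu_connect_irqchip(int vcpu_fd, unsigned int cap, int dev_fd,
                                unsigned int cpu_nr)
{
    struct kvm_enable_cap ec;

    memset(&ec, 0, sizeof(ec));
    ec.cap = cap;
    ec.args[0] = (__u64)dev_fd;   /* args[0]: device fd */
    ec.args[1] = cpu_nr;          /* args[1]: CPU number for this vcpu */

    return ioctl(vcpu_fd, KVM_ENABLE_CAP, &ec);
}
```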
+1
@@ -0,0 +1 @@
This directory contains specific device bindings for KVM_CAP_DEVICE_CTRL.
@@ -0,0 +1,53 @@
MPIC interrupt controller
=========================
Device types supported:
KVM_DEV_TYPE_FSL_MPIC_20 Freescale MPIC v2.0
KVM_DEV_TYPE_FSL_MPIC_42 Freescale MPIC v4.2
Only one MPIC instance, of any type, may be instantiated. The created
MPIC will act as the system interrupt controller, connecting to each
vcpu's interrupt inputs.
Groups:
KVM_DEV_MPIC_GRP_MISC
Attributes:
KVM_DEV_MPIC_BASE_ADDR (rw, 64-bit)
Base address of the 256 KiB MPIC register space. Must be
naturally aligned. A value of zero disables the mapping.
Reset value is zero.
KVM_DEV_MPIC_GRP_REGISTER (rw, 32-bit)
Access an MPIC register, as if the access were made from the guest.
"attr" is the byte offset into the MPIC register space. Accesses
must be 4-byte aligned.
MSIs may be signaled by using this attribute group to write
to the relevant MSIIR.
KVM_DEV_MPIC_GRP_IRQ_ACTIVE (rw, 32-bit)
IRQ input line for each standard openpic source. 0 is inactive and 1
is active, regardless of interrupt sense.
For edge-triggered interrupts: Writing 1 is considered an activating
edge, and writing 0 is ignored. Reading returns 1 if a previously
signaled edge has not been acknowledged, and 0 otherwise.
"attr" is the IRQ number. IRQ numbers for standard sources are the
byte offset of the relevant IVPR from EIVPR0, divided by 32.
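The IRQ numbering rule is simple arithmetic, sketched below. Both offsets are passed in by the caller because the register layout is not given here; the function name and the offsets used in the usage note are hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical helper: derive the IRQ number of a standard source from
 * the byte offset of its IVPR and the byte offset of EIVPR0, both measured
 * from the start of the MPIC register space.
 */
static uint32_t mpic_irq_from_ivpr_offset(uint32_t ivpr_offset,
                                          uint32_t eivpr0_offset)
{
    return (ivpr_offset - eivpr0_offset) / 32;
}
```

For instance, an IVPR located 0x200 bytes after EIVPR0 corresponds to IRQ number 16, which would then be used as "attr" in the KVM_DEV_MPIC_GRP_IRQ_ACTIVE group.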
IRQ Routing:
The MPIC emulation supports IRQ routing. Only a single MPIC device can
be instantiated. Once that device has been created, it's available as
irqchip id 0.
This irqchip 0 has 256 interrupt pins, which expose the interrupts in
the main array of interrupt sources (a.k.a. "SRC" interrupts).
The numbering is the same as the MPIC device tree binding -- based on
the register offset from the beginning of the sources array, without
regard to any subdivisions in chip documentation such as "internal"
or "external" interrupts.
Access to non-SRC interrupts is not implemented through IRQ routing mechanisms.
@@ -0,0 +1,66 @@
XICS interrupt controller
Device type supported: KVM_DEV_TYPE_XICS
Groups:
KVM_DEV_XICS_SOURCES
Attributes: One per interrupt source, indexed by the source number.
This device emulates the XICS (eXternal Interrupt Controller
Specification) defined in PAPR. The XICS has a set of interrupt
sources, each identified by a 20-bit source number, and a set of
Interrupt Control Presentation (ICP) entities, also called "servers",
each associated with a virtual CPU.
The ICP entities are created by enabling the KVM_CAP_IRQ_ARCH
capability for each vcpu, specifying KVM_CAP_IRQ_XICS in args[0] and
the interrupt server number (i.e. the vcpu number from the XICS's
point of view) in args[1] of the kvm_enable_cap struct. Each ICP has
64 bits of state which can be read and written using the
KVM_GET_ONE_REG and KVM_SET_ONE_REG ioctls on the vcpu. The 64 bit
state word has the following bitfields, starting at the
least-significant end of the word:
* Unused, 16 bits
* Pending interrupt priority, 8 bits
Zero is the highest priority, 255 means no interrupt is pending.
* Pending IPI (inter-processor interrupt) priority, 8 bits
Zero is the highest priority, 255 means no IPI is pending.
* Pending interrupt source number, 24 bits
Zero means no interrupt pending, 2 means an IPI is pending
* Current processor priority, 8 bits
Zero is the highest priority, meaning no interrupts can be
delivered, and 255 is the lowest priority.
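The ICP state word layout listed above can be sketched as a pair of pack/unpack helpers. The struct and function names are hypothetical; only the bit positions are taken from the field list.

```c
#include <stdint.h>

/* Hypothetical unpacked view of the 64-bit ICP state word. */
struct icp_state {
    uint8_t  pending_pri;  /* bits 16-23: pending interrupt priority */
    uint8_t  ipi_pri;      /* bits 24-31: pending IPI priority */
    uint32_t pending_src;  /* bits 32-55: pending interrupt source number */
    uint8_t  cppr;         /* bits 56-63: current processor priority */
};

static uint64_t icp_state_pack(struct icp_state s)
{
    /* Bits 0-15 are unused and left as zero. */
    return ((uint64_t)s.pending_pri << 16) |
           ((uint64_t)s.ipi_pri << 24) |
           ((uint64_t)(s.pending_src & 0xffffff) << 32) |
           ((uint64_t)s.cppr << 56);
}

static struct icp_state icp_state_unpack(uint64_t word)
{
    struct icp_state s;

    s.pending_pri = (word >> 16) & 0xff;
    s.ipi_pri     = (word >> 24) & 0xff;
    s.pending_src = (word >> 32) & 0xffffff;
    s.cppr        = (word >> 56) & 0xff;
    return s;
}
```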
Each source has 64 bits of state that can be read and written using
the KVM_GET_DEVICE_ATTR and KVM_SET_DEVICE_ATTR ioctls, specifying the
KVM_DEV_XICS_SOURCES attribute group, with the attribute number being
the interrupt source number. The 64 bit state word has the following
bitfields, starting from the least-significant end of the word:
* Destination (server number), 32 bits
This specifies where the interrupt should be sent, and is the
interrupt server number specified for the destination vcpu.
* Priority, 8 bits
This is the priority specified for this interrupt source, where 0 is
the highest priority and 255 is the lowest. An interrupt with a
priority of 255 will never be delivered.
* Level sensitive flag, 1 bit
This bit is 1 for a level-sensitive interrupt source, or 0 for
edge-sensitive (or MSI).
* Masked flag, 1 bit
This bit is set to 1 if the interrupt is masked (cannot be delivered
regardless of its priority), for example by the ibm,int-off RTAS
call, or 0 if it is not masked.
* Pending flag, 1 bit
This bit is 1 if the source has a pending interrupt, otherwise 0.
Only one XICS instance may be created per VM.
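As an illustration of the per-source state word described above, the sketch below packs and unpacks its fields. The struct and function names are hypothetical; the bit positions follow from the field widths listed from the least-significant end.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical unpacked view of the 64-bit XICS source state word. */
struct xics_source_state {
    uint32_t server;    /* bits 0-31: destination (server number) */
    uint8_t  priority;  /* bits 32-39: 0 is highest, 255 is never delivered */
    bool     level;     /* bit 40: level-sensitive (1) or edge/MSI (0) */
    bool     masked;    /* bit 41: masked flag */
    bool     pending;   /* bit 42: pending flag */
};

static uint64_t xics_source_pack(struct xics_source_state s)
{
    return (uint64_t)s.server |
           ((uint64_t)s.priority << 32) |
           ((uint64_t)s.level << 40) |
           ((uint64_t)s.masked << 41) |
           ((uint64_t)s.pending << 42);
}

static struct xics_source_state xics_source_unpack(uint64_t word)
{
    struct xics_source_state s;

    s.server   = (uint32_t)(word & 0xffffffffULL);
    s.priority = (word >> 32) & 0xff;
    s.level    = (word >> 40) & 1;
    s.masked   = (word >> 41) & 1;
    s.pending  = (word >> 42) & 1;
    return s;
}
```

The packed word is what would be transferred through KVM_GET_DEVICE_ATTR and KVM_SET_DEVICE_ATTR in the KVM_DEV_XICS_SOURCES group.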
-1
@@ -8,7 +8,6 @@
#define __idmap __section(.idmap.text) noinline notrace
extern pgd_t *idmap_pgd;
extern pgd_t *hyp_pgd;
void setup_mm_for_reboot(void);
+32 -15
@@ -87,7 +87,7 @@ struct kvm_vcpu_fault_info {
u32 hyp_pc; /* PC when exception was taken from Hyp mode */
};
typedef struct vfp_hard_struct kvm_kernel_vfp_t;
typedef struct vfp_hard_struct kvm_cpu_context_t;
struct kvm_vcpu_arch {
struct kvm_regs regs;
@@ -105,8 +105,10 @@ struct kvm_vcpu_arch {
struct kvm_vcpu_fault_info fault;
/* Floating point registers (VFP and Advanced SIMD/NEON) */
kvm_kernel_vfp_t vfp_guest;
kvm_kernel_vfp_t *vfp_host;
struct vfp_hard_struct vfp_guest;
/* Host FP context */
kvm_cpu_context_t *host_cpu_context;
/* VGIC state */
struct vgic_cpu vgic_cpu;
@@ -188,23 +190,38 @@ int kvm_arm_coproc_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *);
int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
int exception_index);
static inline void __cpu_init_hyp_mode(unsigned long long pgd_ptr,
static inline void __cpu_init_hyp_mode(unsigned long long boot_pgd_ptr,
unsigned long long pgd_ptr,
unsigned long hyp_stack_ptr,
unsigned long vector_ptr)
{
unsigned long pgd_low, pgd_high;
pgd_low = (pgd_ptr & ((1ULL << 32) - 1));
pgd_high = (pgd_ptr >> 32ULL);
/*
* Call initialization code, and switch to the full blown
* HYP code. The init code doesn't need to preserve these registers as
* r1-r3 and r12 are already callee save according to the AAPCS.
* Note that we slightly misuse the prototype by casing the pgd_low to
* a void *.
* Call initialization code, and switch to the full blown HYP
* code. The init code doesn't need to preserve these
* registers as r0-r3 are already caller saved according to
* the AAPCS.
* Note that we slightly misuse the prototype by casting the
* stack pointer to a void *.
*
* We don't have enough registers to perform the full init in
* one go. Install the boot PGD first, and then install the
* runtime PGD, stack pointer and vectors. The PGDs are always
* passed as the third argument, in order to be passed into
* r2-r3 to the init code (yes, this is compliant with the
* PCS!).
*/
kvm_call_hyp((void *)pgd_low, pgd_high, hyp_stack_ptr, vector_ptr);
kvm_call_hyp(NULL, 0, boot_pgd_ptr);
kvm_call_hyp((void*)hyp_stack_ptr, vector_ptr, pgd_ptr);
}
static inline int kvm_arch_dev_ioctl_check_extension(long ext)
{
return 0;
}
int kvm_perf_init(void);
int kvm_perf_teardown(void);
#endif /* __ARM_KVM_HOST_H__ */
+23 -5
@@ -19,21 +19,33 @@
#ifndef __ARM_KVM_MMU_H__
#define __ARM_KVM_MMU_H__
#include <asm/cacheflush.h>
#include <asm/pgalloc.h>
#include <asm/idmap.h>
#include <asm/memory.h>
#include <asm/page.h>
/*
* We directly use the kernel VA for the HYP, as we can directly share
* the mapping (HTTBR "covers" TTBR1).
*/
#define HYP_PAGE_OFFSET_MASK (~0UL)
#define HYP_PAGE_OFFSET_MASK UL(~0)
#define HYP_PAGE_OFFSET PAGE_OFFSET
#define KERN_TO_HYP(kva) (kva)
/*
* Our virtual mapping for the boot-time MMU-enable code. Must be
* shared across all the page-tables. Conveniently, we use the vectors
* page, where no kernel data will ever be shared with HYP.
*/
#define TRAMPOLINE_VA UL(CONFIG_VECTORS_BASE)
#ifndef __ASSEMBLY__
#include <asm/cacheflush.h>
#include <asm/pgalloc.h>
int create_hyp_mappings(void *from, void *to);
int create_hyp_io_mappings(void *from, void *to, phys_addr_t);
void free_hyp_pmds(void);
void free_boot_hyp_pgd(void);
void free_hyp_pgds(void);
int kvm_alloc_stage2_pgd(struct kvm *kvm);
void kvm_free_stage2_pgd(struct kvm *kvm);
@@ -45,6 +57,8 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu, struct kvm_run *run);
void kvm_mmu_free_memory_caches(struct kvm_vcpu *vcpu);
phys_addr_t kvm_mmu_get_httbr(void);
phys_addr_t kvm_mmu_get_boot_httbr(void);
phys_addr_t kvm_get_idmap_vector(void);
int kvm_mmu_init(void);
void kvm_clear_hyp_idmap(void);
@@ -114,4 +128,8 @@ static inline void coherent_icache_guest_page(struct kvm *kvm, gfn_t gfn)
}
}
#define kvm_flush_dcache_to_poc(a,l) __cpuc_flush_dcache_area((a), (l))
#endif /* !__ASSEMBLY__ */
#endif /* __ARM_KVM_MMU_H__ */
+1 -1
@@ -158,7 +158,7 @@ int main(void)
DEFINE(VCPU_MIDR, offsetof(struct kvm_vcpu, arch.midr));
DEFINE(VCPU_CP15, offsetof(struct kvm_vcpu, arch.cp15));
DEFINE(VCPU_VFP_GUEST, offsetof(struct kvm_vcpu, arch.vfp_guest));
DEFINE(VCPU_VFP_HOST, offsetof(struct kvm_vcpu, arch.vfp_host));
DEFINE(VCPU_VFP_HOST, offsetof(struct kvm_vcpu, arch.host_cpu_context));
DEFINE(VCPU_REGS, offsetof(struct kvm_vcpu, arch.regs));
DEFINE(VCPU_USR_REGS, offsetof(struct kvm_vcpu, arch.regs.usr_regs));
DEFINE(VCPU_SVC_REGS, offsetof(struct kvm_vcpu, arch.regs.svc_regs));
+6 -1
@@ -20,7 +20,7 @@
VMLINUX_SYMBOL(__idmap_text_start) = .; \
*(.idmap.text) \
VMLINUX_SYMBOL(__idmap_text_end) = .; \
ALIGN_FUNCTION(); \
. = ALIGN(32); \
VMLINUX_SYMBOL(__hyp_idmap_text_start) = .; \
*(.hyp.idmap.text) \
VMLINUX_SYMBOL(__hyp_idmap_text_end) = .;
@@ -315,3 +315,8 @@ SECTIONS
*/
ASSERT((__proc_info_end - __proc_info_begin), "missing CPU support")
ASSERT((__arch_info_end - __arch_info_begin), "no machine record defined")
/*
* The HYP init code can't be more than a page long.
* The above comment applies as well.
*/
ASSERT(((__hyp_idmap_text_end - __hyp_idmap_text_start) <= PAGE_SIZE), "HYP init code too big")
+3 -3
@@ -41,9 +41,9 @@ config KVM_ARM_HOST
Provides host support for ARM processors.
config KVM_ARM_MAX_VCPUS
int "Number maximum supported virtual CPUs per VM"
depends on KVM_ARM_HOST
default 4
int "Number maximum supported virtual CPUs per VM" if KVM_ARM_HOST
default 4 if KVM_ARM_HOST
default 0
help
Static number of max supported virtual CPUs per VM.
+1 -1
@@ -18,6 +18,6 @@ kvm-arm-y = $(addprefix ../../../virt/kvm/, kvm_main.o coalesced_mmio.o)
obj-y += kvm-arm.o init.o interrupts.o
obj-y += arm.o handle_exit.o guest.o mmu.o emulate.o reset.o
obj-y += coproc.o coproc_a15.o mmio.o psci.o
obj-y += coproc.o coproc_a15.o mmio.o psci.o perf.o
obj-$(CONFIG_KVM_ARM_VGIC) += vgic.o
obj-$(CONFIG_KVM_ARM_TIMER) += arch_timer.o
+4 -3
@@ -22,6 +22,7 @@
#include <linux/kvm_host.h>
#include <linux/interrupt.h>
#include <clocksource/arm_arch_timer.h>
#include <asm/arch_timer.h>
#include <asm/kvm_vgic.h>
@@ -64,7 +65,7 @@ static void kvm_timer_inject_irq(struct kvm_vcpu *vcpu)
{
struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
timer->cntv_ctl |= 1 << 1; /* Mask the interrupt in the guest */
timer->cntv_ctl |= ARCH_TIMER_CTRL_IT_MASK;
kvm_vgic_inject_irq(vcpu->kvm, vcpu->vcpu_id,
vcpu->arch.timer_cpu.irq->irq,
vcpu->arch.timer_cpu.irq->level);
@@ -133,8 +134,8 @@ void kvm_timer_sync_hwstate(struct kvm_vcpu *vcpu)
cycle_t cval, now;
u64 ns;
/* Check if the timer is enabled and unmasked first */
if ((timer->cntv_ctl & 3) != 1)
if ((timer->cntv_ctl & ARCH_TIMER_CTRL_IT_MASK) ||
!(timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE))
return;
cval = timer->cntv_cval;
+75 -54
@@ -16,6 +16,7 @@
* Foundation, 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
*/
#include <linux/cpu.h>
#include <linux/errno.h>
#include <linux/err.h>
#include <linux/kvm_host.h>
@@ -48,7 +49,7 @@ __asm__(".arch_extension virt");
#endif
static DEFINE_PER_CPU(unsigned long, kvm_arm_hyp_stack_page);
static kvm_kernel_vfp_t __percpu *kvm_host_vfp_state;
static kvm_cpu_context_t __percpu *kvm_host_cpu_state;
static unsigned long hyp_default_vectors;
/* Per-CPU variable containing the currently running vcpu. */
@@ -206,7 +207,7 @@ int kvm_dev_ioctl_check_extension(long ext)
r = KVM_MAX_VCPUS;
break;
default:
r = 0;
r = kvm_arch_dev_ioctl_check_extension(ext);
break;
}
return r;
@@ -218,27 +219,18 @@ long kvm_arch_dev_ioctl(struct file *filp,
return -EINVAL;
}
int kvm_arch_set_memory_region(struct kvm *kvm,
struct kvm_userspace_memory_region *mem,
struct kvm_memory_slot old,
int user_alloc)
{
return 0;
}
int kvm_arch_prepare_memory_region(struct kvm *kvm,
struct kvm_memory_slot *memslot,
struct kvm_memory_slot old,
struct kvm_userspace_memory_region *mem,
bool user_alloc)
enum kvm_mr_change change)
{
return 0;
}
void kvm_arch_commit_memory_region(struct kvm *kvm,
struct kvm_userspace_memory_region *mem,
struct kvm_memory_slot old,
bool user_alloc)
const struct kvm_memory_slot *old,
enum kvm_mr_change change)
{
}
@@ -326,7 +318,7 @@ void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu)
void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
{
vcpu->cpu = cpu;
vcpu->arch.vfp_host = this_cpu_ptr(kvm_host_vfp_state);
vcpu->arch.host_cpu_context = this_cpu_ptr(kvm_host_cpu_state);
/*
* Check whether this vcpu requires the cache to be flushed on
@@ -639,7 +631,8 @@ static int vcpu_interrupt_line(struct kvm_vcpu *vcpu, int number, bool level)
return 0;
}
int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_level)
int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_level,
bool line_status)
{
u32 irq = irq_level->irq;
unsigned int irq_type, vcpu_idx, irq_num;
@@ -794,30 +787,48 @@ long kvm_arch_vm_ioctl(struct file *filp,
}
}
static void cpu_init_hyp_mode(void *vector)
static void cpu_init_hyp_mode(void *dummy)
{
unsigned long long boot_pgd_ptr;
unsigned long long pgd_ptr;
unsigned long hyp_stack_ptr;
unsigned long stack_page;
unsigned long vector_ptr;
/* Switch from the HYP stub to our own HYP init vector */
__hyp_set_vectors((unsigned long)vector);
__hyp_set_vectors(kvm_get_idmap_vector());
boot_pgd_ptr = (unsigned long long)kvm_mmu_get_boot_httbr();
pgd_ptr = (unsigned long long)kvm_mmu_get_httbr();
stack_page = __get_cpu_var(kvm_arm_hyp_stack_page);
hyp_stack_ptr = stack_page + PAGE_SIZE;
vector_ptr = (unsigned long)__kvm_hyp_vector;
__cpu_init_hyp_mode(pgd_ptr, hyp_stack_ptr, vector_ptr);
__cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr, vector_ptr);
}
static int hyp_init_cpu_notify(struct notifier_block *self,
unsigned long action, void *cpu)
{
switch (action) {
case CPU_STARTING:
case CPU_STARTING_FROZEN:
cpu_init_hyp_mode(NULL);
break;
}
return NOTIFY_OK;
}
static struct notifier_block hyp_init_cpu_nb = {
.notifier_call = hyp_init_cpu_notify,
};
/**
* Inits Hyp-mode on all online CPUs
*/
static int init_hyp_mode(void)
{
phys_addr_t init_phys_addr;
int cpu;
int err = 0;
@@ -849,24 +860,6 @@ static int init_hyp_mode(void)
per_cpu(kvm_arm_hyp_stack_page, cpu) = stack_page;
}
/*
* Execute the init code on each CPU.
*
* Note: The stack is not mapped yet, so don't do anything else than
* initializing the hypervisor mode on each CPU using a local stack
* space for temporary storage.
*/
init_phys_addr = virt_to_phys(__kvm_hyp_init);
for_each_online_cpu(cpu) {
smp_call_function_single(cpu, cpu_init_hyp_mode,
(void *)(long)init_phys_addr, 1);
}
/*
* Unmap the identity mapping
*/
kvm_clear_hyp_idmap();
/*
* Map the Hyp-code called directly from the host
*/
@@ -890,33 +883,38 @@ static int init_hyp_mode(void)
}
/*
* Map the host VFP structures
* Map the host CPU structures
*/
kvm_host_vfp_state = alloc_percpu(kvm_kernel_vfp_t);
if (!kvm_host_vfp_state) {
kvm_host_cpu_state = alloc_percpu(kvm_cpu_context_t);
if (!kvm_host_cpu_state) {
err = -ENOMEM;
kvm_err("Cannot allocate host VFP state\n");
kvm_err("Cannot allocate host CPU state\n");
goto out_free_mappings;
}
for_each_possible_cpu(cpu) {
kvm_kernel_vfp_t *vfp;
kvm_cpu_context_t *cpu_ctxt;
vfp = per_cpu_ptr(kvm_host_vfp_state, cpu);
err = create_hyp_mappings(vfp, vfp + 1);
cpu_ctxt = per_cpu_ptr(kvm_host_cpu_state, cpu);
err = create_hyp_mappings(cpu_ctxt, cpu_ctxt + 1);
if (err) {
kvm_err("Cannot map host VFP state: %d\n", err);
goto out_free_vfp;
kvm_err("Cannot map host CPU state: %d\n", err);
goto out_free_context;
}
}
/*
* Execute the init code on each CPU.
*/
on_each_cpu(cpu_init_hyp_mode, NULL, 1);
/*
* Init HYP view of VGIC
*/
err = kvm_vgic_hyp_init();
if (err)
goto out_free_vfp;
goto out_free_context;
#ifdef CONFIG_KVM_ARM_VGIC
vgic_present = true;
@@ -929,12 +927,19 @@ static int init_hyp_mode(void)
if (err)
goto out_free_mappings;
#ifndef CONFIG_HOTPLUG_CPU
free_boot_hyp_pgd();
#endif
kvm_perf_init();
kvm_info("Hyp mode initialized successfully\n");
return 0;
out_free_vfp:
free_percpu(kvm_host_vfp_state);
out_free_context:
free_percpu(kvm_host_cpu_state);
out_free_mappings:
free_hyp_pmds();
free_hyp_pgds();
out_free_stack_pages:
for_each_possible_cpu(cpu)
free_page(per_cpu(kvm_arm_hyp_stack_page, cpu));
@@ -943,27 +948,42 @@ out_err:
return err;
}
static void check_kvm_target_cpu(void *ret)
{
*(int *)ret = kvm_target_cpu();
}
/**
* Initialize Hyp-mode and memory mappings on all CPUs.
*/
int kvm_arch_init(void *opaque)
{
int err;
int ret, cpu;
if (!is_hyp_mode_available()) {
kvm_err("HYP mode not available\n");
return -ENODEV;
}
if (kvm_target_cpu() < 0) {
kvm_err("Target CPU not supported!\n");
return -ENODEV;
for_each_online_cpu(cpu) {
smp_call_function_single(cpu, check_kvm_target_cpu, &ret, 1);
if (ret < 0) {
kvm_err("Error, CPU %d not supported!\n", cpu);
return -ENODEV;
}
}
err = init_hyp_mode();
if (err)
goto out_err;
err = register_cpu_notifier(&hyp_init_cpu_nb);
if (err) {
kvm_err("Cannot register HYP init CPU notifier (%d)\n", err);
goto out_err;
}
kvm_coproc_table_init();
return 0;
out_err:
@@ -973,6 +993,7 @@ out_err:
/* NOP: Compiling as a module not supported */
void kvm_arch_exit(void)
{
kvm_perf_teardown();
}
static int arm_init(void)
+59 -19
@@ -21,13 +21,33 @@
#include <asm/asm-offsets.h>
#include <asm/kvm_asm.h>
#include <asm/kvm_arm.h>
#include <asm/kvm_mmu.h>
/********************************************************************
* Hypervisor initialization
* - should be called with:
* r0,r1 = Hypervisor pgd pointer
* r2 = top of Hyp stack (kernel VA)
* r3 = pointer to hyp vectors
* r0 = top of Hyp stack (kernel VA)
* r1 = pointer to hyp vectors
* r2,r3 = Hypervisor pgd pointer
*
* The init scenario is:
* - We jump in HYP with four parameters: boot HYP pgd, runtime HYP pgd,
* runtime stack, runtime vectors
* - Enable the MMU with the boot pgd
* - Jump to a target into the trampoline page (remember, this is the same
* physical page!)
* - Now switch to the runtime pgd (same VA, and still the same physical
* page!)
* - Invalidate TLBs
* - Set stack and vectors
* - Profit! (or eret, if you only care about the code).
*
* As we only have four registers available to pass parameters (and we
* need six), we split the init in two phases:
* - Phase 1: r0 = 0, r1 = 0, r2,r3 contain the boot PGD.
* Provides the basic HYP init, and enable the MMU.
* - Phase 2: r0 = ToS, r1 = vectors, r2,r3 contain the runtime PGD.
* Switches to the runtime PGD, set stack and vectors.
*/
.text
@@ -47,22 +67,25 @@ __kvm_hyp_init:
W(b) .
__do_hyp_init:
cmp r0, #0 @ We have a SP?
bne phase2 @ Yes, second stage init
@ Set the HTTBR to point to the hypervisor PGD pointer passed
mcrr p15, 4, r0, r1, c2
mcrr p15, 4, r2, r3, c2
@ Set the HTCR and VTCR to the same shareability and cacheability
@ settings as the non-secure TTBCR and with T0SZ == 0.
mrc p15, 4, r0, c2, c0, 2 @ HTCR
ldr r12, =HTCR_MASK
bic r0, r0, r12
ldr r2, =HTCR_MASK
bic r0, r0, r2
mrc p15, 0, r1, c2, c0, 2 @ TTBCR
and r1, r1, #(HTCR_MASK & ~TTBCR_T0SZ)
orr r0, r0, r1
mcr p15, 4, r0, c2, c0, 2 @ HTCR
mrc p15, 4, r1, c2, c1, 2 @ VTCR
ldr r12, =VTCR_MASK
bic r1, r1, r12
ldr r2, =VTCR_MASK
bic r1, r1, r2
bic r0, r0, #(~VTCR_HTCR_SH) @ clear non-reusable HTCR bits
orr r1, r0, r1
orr r1, r1, #(KVM_VTCR_SL0 | KVM_VTCR_T0SZ | KVM_VTCR_S)
@@ -85,24 +108,41 @@ __do_hyp_init:
@ - Memory alignment checks: enabled
@ - MMU: enabled (this code must be run from an identity mapping)
mrc p15, 4, r0, c1, c0, 0 @ HSCR
ldr r12, =HSCTLR_MASK
bic r0, r0, r12
ldr r2, =HSCTLR_MASK
bic r0, r0, r2
mrc p15, 0, r1, c1, c0, 0 @ SCTLR
ldr r12, =(HSCTLR_EE | HSCTLR_FI | HSCTLR_I | HSCTLR_C)
and r1, r1, r12
ARM( ldr r12, =(HSCTLR_M | HSCTLR_A) )
THUMB( ldr r12, =(HSCTLR_M | HSCTLR_A | HSCTLR_TE) )
orr r1, r1, r12
ldr r2, =(HSCTLR_EE | HSCTLR_FI | HSCTLR_I | HSCTLR_C)
and r1, r1, r2
ARM( ldr r2, =(HSCTLR_M | HSCTLR_A) )
THUMB( ldr r2, =(HSCTLR_M | HSCTLR_A | HSCTLR_TE) )
orr r1, r1, r2
orr r0, r0, r1
isb
mcr p15, 4, r0, c1, c0, 0 @ HSCR
isb
@ Set stack pointer and return to the kernel
mov sp, r2
@ End of init phase-1
eret
phase2:
@ Set stack pointer
mov sp, r0
@ Set HVBAR to point to the HYP vectors
mcr p15, 4, r3, c12, c0, 0 @ HVBAR
mcr p15, 4, r1, c12, c0, 0 @ HVBAR
@ Jump to the trampoline page
ldr r0, =TRAMPOLINE_VA
adr r1, target
bfi r0, r1, #0, #PAGE_SHIFT
mov pc, r0
target: @ We're now in the trampoline code, switch page tables
mcrr p15, 4, r2, r3, c2
isb
@ Invalidate the old TLBs
mcr p15, 4, r0, c8, c7, 0 @ TLBIALLH
dsb
eret
+343 -284
File diff suppressed because it is too large
+68
@@ -0,0 +1,68 @@
/*
* Based on the x86 implementation.
*
* Copyright (C) 2012 ARM Ltd.
* Author: Marc Zyngier <marc.zyngier@arm.com>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
#include <linux/perf_event.h>
#include <linux/kvm_host.h>
#include <asm/kvm_emulate.h>
static int kvm_is_in_guest(void)
{
return kvm_arm_get_running_vcpu() != NULL;
}
static int kvm_is_user_mode(void)
{
struct kvm_vcpu *vcpu;
vcpu = kvm_arm_get_running_vcpu();
if (vcpu)
return !vcpu_mode_priv(vcpu);
return 0;
}
static unsigned long kvm_get_guest_ip(void)
{
struct kvm_vcpu *vcpu;
vcpu = kvm_arm_get_running_vcpu();
if (vcpu)
return *vcpu_pc(vcpu);
return 0;
}
static struct perf_guest_info_callbacks kvm_guest_cbs = {
.is_in_guest = kvm_is_in_guest,
.is_user_mode = kvm_is_user_mode,
.get_guest_ip = kvm_get_guest_ip,
};
int kvm_perf_init(void)
{
return perf_register_guest_info_callbacks(&kvm_guest_cbs);
}
int kvm_perf_teardown(void)
{
return perf_unregister_guest_info_callbacks(&kvm_guest_cbs);
}
+1 -31
@@ -8,7 +8,6 @@
#include <asm/pgtable.h>
#include <asm/sections.h>
#include <asm/system_info.h>
#include <asm/virt.h>
pgd_t *idmap_pgd;
@@ -83,37 +82,10 @@ static void identity_mapping_add(pgd_t *pgd, const char *text_start,
} while (pgd++, addr = next, addr != end);
}
#if defined(CONFIG_ARM_VIRT_EXT) && defined(CONFIG_ARM_LPAE)
pgd_t *hyp_pgd;
extern char __hyp_idmap_text_start[], __hyp_idmap_text_end[];
static int __init init_static_idmap_hyp(void)
{
hyp_pgd = kzalloc(PTRS_PER_PGD * sizeof(pgd_t), GFP_KERNEL);
if (!hyp_pgd)
return -ENOMEM;
pr_info("Setting up static HYP identity map for 0x%p - 0x%p\n",
__hyp_idmap_text_start, __hyp_idmap_text_end);
identity_mapping_add(hyp_pgd, __hyp_idmap_text_start,
__hyp_idmap_text_end, PMD_SECT_AP1);
return 0;
}
#else
static int __init init_static_idmap_hyp(void)
{
return 0;
}
#endif
extern char __idmap_text_start[], __idmap_text_end[];
static int __init init_static_idmap(void)
{
int ret;
idmap_pgd = pgd_alloc(&init_mm);
if (!idmap_pgd)
return -ENOMEM;
@@ -123,12 +95,10 @@ static int __init init_static_idmap(void)
identity_mapping_add(idmap_pgd, __idmap_text_start,
__idmap_text_end, 0);
ret = init_static_idmap_hyp();
/* Flush L1 for the hardware to see this page table content */
flush_cache_louis();
return ret;
return 0;
}
early_initcall(init_static_idmap);
+1
@@ -26,6 +26,7 @@
#define KVM_USER_MEM_SLOTS 32
#define KVM_COALESCED_MMIO_PAGE_OFFSET 1
#define KVM_IRQCHIP_NUM_PINS KVM_IOAPIC_NUM_PINS
/* define exit reasons from vmm to kvm*/
#define EXIT_REASON_VM_PANIC 0
-1
@@ -27,7 +27,6 @@
/* Select x86 specific features in <linux/kvm.h> */
#define __KVM_HAVE_IOAPIC
#define __KVM_HAVE_IRQ_LINE
#define __KVM_HAVE_DEVICE_ASSIGNMENT
/* Architectural interrupt line count. */
#define KVM_NR_INTERRUPTS 256
+12 -2
@@ -21,12 +21,11 @@ config KVM
tristate "Kernel-based Virtual Machine (KVM) support"
depends on BROKEN
depends on HAVE_KVM && MODULES
# for device assignment:
depends on PCI
depends on BROKEN
select PREEMPT_NOTIFIERS
select ANON_INODES
select HAVE_KVM_IRQCHIP
select HAVE_KVM_IRQ_ROUTING
select KVM_APIC_ARCHITECTURE
select KVM_MMIO
---help---
@@ -50,6 +49,17 @@ config KVM_INTEL
Provides support for KVM on Itanium 2 processors equipped with the VT
extensions.
config KVM_DEVICE_ASSIGNMENT
bool "KVM legacy PCI device assignment support"
depends on KVM && PCI && IOMMU_API
default y
---help---
Provide support for legacy PCI device assignment through KVM. The
kernel now also supports a full featured userspace device driver
framework through VFIO, which supersedes much of this support.
If unsure, say Y.
source drivers/vhost/Kconfig
endif # VIRTUALIZATION

Some files were not shown because too many files have changed in this diff