Commit Graph

171 Commits

Author SHA1 Message Date
Avi Kivity 4d4ec08745 KVM: Replace read accesses of vcpu->arch.cr0 by an accessor
Since we'd like to allow the guest to own a few bits of cr0 at times, we need
to know when we access those bits.

Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01 12:35:50 -03:00
Sheng Yang 17cc393596 KVM: x86: Rename gb_page_enable() to get_lpage_level() in kvm_x86_ops
Then the callback can provide the maximum supported large page level, which
is more flexible.

Also move the gb page support into x86_64 specific.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-01 12:35:46 -03:00
Joerg Roedel 953899b659 KVM: SVM: Adjust tsc_offset only if tsc_unstable
The tsc_offset adjustment in svm_vcpu_load is executed
unconditionally even if Linux considers the host tsc as
stable. This causes a Linux guest detecting an unstable tsc
in any case.
This patch removes the tsc_offset adjustment if the host tsc
is stable. The guest will now get the benefit of a stable
tsc too.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01 12:35:41 -03:00
Sheng Yang 4e47c7a6d7 KVM: VMX: Add instruction rdtscp support for guest
Before enabling, execution of "rdtscp" in guest would result in #UD.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01 12:35:40 -03:00
Sheng Yang 0e85188049 KVM: Add cpuid_update() callback to kvm_x86_ops
Sometime, we need to adjust some state in order to reflect guest CPUID
setting, e.g. if we don't expose rdtscp to guest, we won't want to enable
it on hardware. cpuid_update() is introduced for this purpose.

Also export kvm_find_cpuid_entry() for later use.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01 12:35:40 -03:00
Linus Torvalds d0316554d3 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (34 commits)
  m68k: rename global variable vmalloc_end to m68k_vmalloc_end
  percpu: add missing per_cpu_ptr_to_phys() definition for UP
  percpu: Fix kdump failure if booted with percpu_alloc=page
  percpu: make misc percpu symbols unique
  percpu: make percpu symbols in ia64 unique
  percpu: make percpu symbols in powerpc unique
  percpu: make percpu symbols in x86 unique
  percpu: make percpu symbols in xen unique
  percpu: make percpu symbols in cpufreq unique
  percpu: make percpu symbols in oprofile unique
  percpu: make percpu symbols in tracer unique
  percpu: make percpu symbols under kernel/ and mm/ unique
  percpu: remove some sparse warnings
  percpu: make alloc_percpu() handle array types
  vmalloc: fix use of non-existent percpu variable in put_cpu_var()
  this_cpu: Use this_cpu_xx in trace_functions_graph.c
  this_cpu: Use this_cpu_xx for ftrace
  this_cpu: Use this_cpu_xx in nmi handling
  this_cpu: Use this_cpu operations in RCU
  this_cpu: Use this_cpu ops for VM statistics
  ...

Fix up trivial (famous last words) global per-cpu naming conflicts in
	arch/x86/kvm/svm.c
	mm/slab.c
2009-12-14 09:58:24 -08:00
Jan Kiszka 3cfc3092f4 KVM: x86: Add KVM_GET/SET_VCPU_EVENTS
This new IOCTL exports all yet user-invisible states related to
exceptions, interrupts, and NMIs. Together with appropriate user space
changes, this fixes sporadic problems of vmsave/restore, live migration
and system reset.

[avi: future-proof abi by adding a flags field]

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:25 +02:00
Eduardo Habkost 3ce672d484 KVM: SVM: init_vmcb(): remove redundant save->cr0 initialization
The svm_set_cr0() call will initialize save->cr0 properly even when npt is
enabled, clearing the NW and CD bits as expected, so we don't need to
initialize it manually for npt_enabled anymore.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:21 +02:00
Eduardo Habkost 18fa000ae4 KVM: SVM: Reset cr0 properly on vcpu reset
svm_vcpu_reset() was not properly resetting the contents of the guest-visible
cr0 register, causing the following issue:
https://bugzilla.redhat.com/show_bug.cgi?id=525699

Without resetting cr0 properly, the vcpu was running the SIPI bootstrap routine
with paging enabled, making the vcpu get a pagefault exception while trying to
run it.

Instead of setting vmcb->save.cr0 directly, the new code just resets
kvm->arch.cr0 and calls kvm_set_cr0(). The bits that were set/cleared on
vmcb->save.cr0 (PG, WP, !CD, !NW) will be set properly by svm_set_cr0().

kvm_set_cr0() is used instead of calling svm_set_cr0() directly to make sure
kvm_mmu_reset_context() is called to reset the mmu to nonpaging mode.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:21 +02:00
Jan Kiszka 6be7d3062b KVM: SVM: Cleanup NMI singlestep
Push the NMI-related singlestep variable into vcpu_svm. It's dealing
with an AMD-specific deficit, nothing generic for x86.

Acked-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

 arch/x86/include/asm/kvm_host.h |    1 -
 arch/x86/kvm/svm.c              |   12 +++++++-----
 2 files changed, 7 insertions(+), 6 deletions(-)
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:19 +02:00
Mark Langsdorf 565d0998ec KVM: SVM: Support Pause Filter in AMD processors
New AMD processors (Family 0x10 models 8+) support the Pause
Filter Feature.  This feature creates a new field in the VMCB
called Pause Filter Count.  If Pause Filter Count is greater
than 0 and intercepting PAUSEs is enabled, the processor will
increment an internal counter when a PAUSE instruction occurs
instead of intercepting.  When the internal counter reaches the
Pause Filter Count value, a PAUSE intercept will occur.

This feature can be used to detect contended spinlocks,
especially when the lock holding VCPU is not scheduled.
Rescheduling another VCPU prevents the VCPU seeking the
lock from wasting its quantum by spinning idly.

Experimental results show that most spinlocks are held
for less than 1000 PAUSE cycles or more than a few
thousand.  Default the Pause Filter Counter to 3000 to
detect the contended spinlocks.

Processor support for this feature is indicated by a CPUID
bit.

On a 24 core system running 4 guests each with 16 VCPUs,
this patch improved overall performance of each guest's
32 job kernbench by approximately 3-5% when combined
with a scheduler algorithm thati caused the VCPU to
sleep for a brief period. Further performance improvement
may be possible with a more sophisticated yield algorithm.

Signed-off-by: Mark Langsdorf <mark.langsdorf@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:17 +02:00
Joerg Roedel d36f19e9ec KVM: SVM: Remove nsvm_printk debugging code
With all important informations now delivered through
tracepoints we can savely remove the nsvm_printk debugging
code for nested svm.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:17 +02:00
Joerg Roedel 532a46b989 KVM: SVM: Add tracepoint for skinit instruction
This patch adds a tracepoint for the event that the guest
executed the SKINIT instruction. This information is
important because SKINIT is an SVM extenstion not yet
implemented by nested SVM and we may need this information
for debugging hypervisors that do not yet run on nested SVM.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:16 +02:00
Joerg Roedel ec1ff79084 KVM: SVM: Add tracepoint for invlpga instruction
This patch adds a tracepoint for the event that the guest
executed the INVLPGA instruction.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:16 +02:00
Joerg Roedel 236649de33 KVM: SVM: Add tracepoint for #vmexit because intr pending
This patch adds a special tracepoint for the event that a
nested #vmexit is injected because kvm wants to inject an
interrupt into the guest.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:16 +02:00
Joerg Roedel 17897f3668 KVM: SVM: Add tracepoint for injected #vmexit
This patch adds a tracepoint for a nested #vmexit that gets
re-injected to the guest.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:15 +02:00
Joerg Roedel d8cabddf7e KVM: SVM: Add tracepoint for nested #vmexit
This patch adds a tracepoint for every #vmexit we get from a
nested guest.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:15 +02:00
Joerg Roedel 0ac406de8f KVM: SVM: Add tracepoint for nested vmrun
This patch adds a dedicated kvm tracepoint for a nested
vmrun.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:15 +02:00
Joerg Roedel cd3ff653ae KVM: SVM: Move INTR vmexit out of atomic code
The nested SVM code emulates a #vmexit caused by a request
to open the irq window right in the request function. This
is a bug because the request function runs with preemption
and interrupts disabled but the #vmexit emulation might
sleep. This can cause a schedule()-while-atomic bug and is
fixed with this patch.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:15 +02:00
Alexander Graf 8d23c46624 KVM: SVM: Notify nested hypervisor of lost event injections
If event_inj is valid on a #vmexit the host CPU would write
the contents to exit_int_info, so the hypervisor knows that
the event wasn't injected.

We don't do this in nested SVM by now which is a bug and
fixed by this patch.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:14 +02:00
Jan Kiszka 355be0b930 KVM: x86: Refactor guest debug IOCTL handling
Much of so far vendor-specific code for setting up guest debug can
actually be handled by the generic code. This also fixes a minor deficit
in the SVM part /wrt processing KVM_GUESTDBG_ENABLE.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-12-03 09:32:14 +02:00
Zachary Amsden 3230bb4707 KVM: Fix hotplug of CPUs
Both VMX and SVM require per-cpu memory allocation, which is done at module
init time, for only online cpus.

Backend was not allocating enough structure for all possible CPUs, so
new CPUs coming online could not be hardware enabled.

Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:13 +02:00
Zachary Amsden e6732a5af9 KVM: Fix printk name error in svm.c
Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:13 +02:00
Joerg Roedel e935d48e1b KVM: SVM: Remove remaining occurences of rdtscll
This patch replaces them with native_read_tsc() which can
also be used in expressions and saves a variable on the
stack in this case.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:12 +02:00
Joerg Roedel 33527ad7e1 KVM: SVM: don't copy exit_int_info on nested vmrun
The exit_int_info field is only written by the hardware and
never read. So it does not need to be copied on a vmrun
emulation.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2009-12-03 09:32:11 +02:00