Intel hosts only support syscall/sysret in long more (and only if efer.sce
is enabled), so only reload the related MSR_K6_STAR if the guest will
actually be able to use it.
This reduces vmexit cost by about 500 cycles (6400 -> 5870) on my setup.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Some msrs are only used by x86_64 instructions, and are therefore
not needed when the guest is legacy mode. By not bothering to switch
them, we reduce vmexit latency by 2400 cycles (from about 8800) when
running a 32-bt guest on a 64-bit host.
Signed-off-by: Avi Kivity <avi@qumranet.com>
THe automatically switched msrs are never changed on the host (with
the exception of MSR_KERNEL_GS_BASE) and thus there is no need to save
them on every vm entry.
This reduces vmexit latency by ~400 cycles on i386 and by ~900 cycles (10%)
on x86_64.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Usually, guest page faults are detected by the kvm page fault handler,
which detects if they are shadow faults, mmio faults, pagetable faults,
or normal guest page faults.
However, in ceratin circumstances, we can detect a page fault much later.
One of these events is the following combination:
- A two memory operand instruction (e.g. movsb) is executed.
- The first operand is in mmio space (which is the fault reported to kvm)
- The second operand is in an ummaped address (e.g. a guest page fault)
The Windows 2000 installer does such an access, an promptly hangs. Fix
by adding the missing page fault injection on that path.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Some guests (Solaris) do not set up all four pdptrs, but leave some invalid.
kvm incorrectly treated these as valid page directories, pinning the
wrong pages and causing general confusion.
Fix by checking the valid bit of a pae pdpte. This closes sourceforge bug
1698922.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Solaris panics if it sees a cpu with no fpu, and it seems to rely on this
bit. Closes sourceforge bug 1698920.
Signed-off-by: Avi Kivity <avi@qumranet.com>
The expression
sp - 6 < sp
where sp is a u16 is undefined in C since 'sp - 6' is promoted to int,
and signed overflow is undefined in C. gcc 4.2 actually warns about it.
Replace with a simpler test.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Avi Kivity <avi@qumranet.com>
This patch enables the virtualization of the last branch record MSRs on
SVM if this feature is available in hardware. It also introduces a small
and simple check feature for specific SVM extensions.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
With this, we can specify that accesses to one physical memory range will
be remapped to another. This is useful for the vga window at 0xa0000 which
is used as a movable window into the (much larger) framebuffer.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Mapping a guest page to a host page is a common operation. Currently,
one has first to find the memory slot where the page belongs (gfn_to_memslot),
then locate the page itself (gfn_to_page()).
This is clumsy, and also won't work well with memory aliases. So simplify
gfn_to_page() not to require memory slot translation first, and instead do it
internally.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Functions that play around with the physical memory map
need a way to clear mappings to possibly nonexistent or
invalid memory. Both the mmu cache and the processor tlb
are cleared.
Signed-off-by: Dor Laor <dor.laor@qumranet.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
On x86, bit operations operate on a string of bits that can reside in
multiple words. For example, 'btsl %eax, (blah)' will touch the word
at blah+4 if %eax is between 32 and 63.
The x86 emulator compensates for that by advancing the operand address
by (bit offset / BITS_PER_LONG) and truncating the bit offset to the
range (0..BITS_PER_LONG-1). This has a side effect of forcing the operand
size to 8 bytes on 64-bit hosts.
Now, a 32-bit guest goes and fork()s a process. It write protects a stack
page at 0xbffff000 using the 'btr' instruction, at offset 0xffc in the page
table, with bit offset 1 (for the write permission bit).
The emulator now forces the operand size to 8 bytes as previously described,
and an innocent page table update turns into a cross-page-boundary write,
which is assumed by the mmu code not to be a page table, so it doesn't
actually clear the corresponding shadow page table entry. The guest and
host permissions are out of sync and guest memory is corrupted soon
afterwards, leading to guest failure.
Fix by not using BITS_PER_LONG as the word size; instead use the actual
operand size, so we get a 32-bit write in that case.
Note we still have to teach the mmu to handle cross-page-boundary writes
to guest page table; but for now this allows Damn Small Linux 0.4 (2.4.20)
to boot.
Signed-off-by: Avi Kivity <avi@qumranet.com>
Remove unused function
CC drivers/kvm/svm.o
drivers/kvm/svm.c:207: warning: ‘inject_db’ defined but not used
Signed-off-by: Michal Piotrowski <michal.k.k.piotrowski@gmail.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
When a vcpu is migrated from one cpu to another, its timestamp counter
may lose its monotonic property if the host has unsynced timestamp counters.
This can confuse the guest, sometimes to the point of refusing to boot.
As the rdtsc instruction is rather fast on AMD processors (7-10 cycles),
we can simply record the last host tsc when we drop the cpu, and adjust
the vcpu tsc offset when we detect that we've migrated to a different cpu.
Signed-off-by: Avi Kivity <avi@qumranet.com>
The kvm mmu keeps a shadow page for hugepage pdes; if several such pdes map
the same physical address, they share the same shadow page. This is a fairly
common case (kernel mappings on i386 nonpae Linux, for example).
However, if the two pdes map the same memory but with different permissions, kvm
will happily use the cached shadow page. If the access through the more
permissive pde will occur after the access to the strict pde, an endless pagefault
loop will be generated and the guest will make no progress.
Fix by making the access permissions part of the cache lookup key.
The fix allows Xen pae to boot on kvm and run guest domains.
Thanks to Jeremy Fitzhardinge for reporting the bug and testing the fix.
Signed-off-by: Avi Kivity <avi@qumranet.com>
This patch forbids the guest to execute monitor/mwait instructions on
SVM. This is necessary because the guest can execute these instructions
if they are available even if the kvm cpuid doesn't report its
existence.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>