Commit Graph

314 Commits

Author SHA1 Message Date
Dan Carpenter fa93384f40 sched: Fix signedness bug in yield_to()
yield_to() is supposed to return -ESRCH if there is no task to
yield to, but because the type is bool that is the same as returning
true.

The only place I see which cares is kvm_vcpu_on_spin().

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Raghavendra <raghavendra.kt@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org
Link: http://lkml.kernel.org/r/20140523102042.GA7267@mwanda
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-06-05 11:52:13 +02:00
Cornelia Huck f3f710bc64 KVM: Bump KVM_MAX_IRQ_ROUTES for s390
The maximum number for irq routes is currently 1024, which is a bit on
the small size for s390: We support up to 4 x 64k virtual devices with
up to 64 queues, and we need one route for each of the queues if we want
to operate it via irqfd.

Let's bump this to 4k on s390 for now, as this at least covers the saner
setups.

We need to find a more general solution, though, as we can't just grow
the routing table indefinitly.

Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-03-21 13:43:13 +01:00
Cornelia Huck 8422359877 KVM: s390: irq routing for adapter interrupts.
Introduce a new interrupt class for s390 adapter interrupts and enable
irqfds for s390.

This is depending on a new s390 specific vm capability, KVM_CAP_S390_IRQCHIP,
that needs to be enabled by userspace.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2014-03-21 13:43:00 +01:00
Takuya Yoshikawa 5befdc385d KVM: Simplify kvm->tlbs_dirty handling
When this was introduced, kvm_flush_remote_tlbs() could be called
without holding mmu_lock.  It is now acknowledged that the function
must be called before releasing mmu_lock, and all callers have already
been changed to do so.

There is no need to use smp_mb() and cmpxchg() any more.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-02-18 10:07:26 +01:00
Dominik Dingel e0ead41a6d KVM: async_pf: Provide additional direct page notification
By setting a Kconfig option, the architecture can control when
guest notifications will be presented by the apf backend.
There is the default batch mechanism, working as before, where the vcpu
thread should pull in this information.
Opposite to this, there is now the direct mechanism, that will push the
information to the guest.
This way s390 can use an already existing architecture interface.

Still the vcpu thread should call check_completion to cleanup leftovers.

Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-01-30 12:51:38 +01:00
Jens Freimann c05c4186bb KVM: s390: add floating irq controller
This patch adds a floating irq controller as a kvm_device.
It will be necessary for migration of floating interrupts as well
as for hardening the reset code by allowing user space to explicitly
remove all pending floating interrupts.

Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-01-30 10:25:20 +01:00
Stephen Hemminger ea0269bc34 kvm: remove dead code
The function kvm_io_bus_read_cookie is defined but never used
in current in-tree code.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2014-01-08 19:03:00 -02:00
Stephen Hemminger 7940876e13 kvm: make local functions static
Running 'make namespacecheck' found lots of functions that
should be declared static, since only used in one file.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2014-01-08 19:02:58 -02:00
Christoffer Dall 7330672bef KVM: arm-vgic: Support KVM_CREATE_DEVICE for VGIC
Support creating the ARM VGIC device through the KVM_CREATE_DEVICE
ioctl, which can then later be leveraged to use the
KVM_{GET/SET}_DEVICE_ATTR, which is useful both for setting addresses in
a more generic API than the ARM-specific one and is useful for
save/restore of VGIC state.

Adds KVM_CAP_DEVICE_CTRL to ARM capabilities.

Note that we change the check for creating a VGIC from bailing out if
any VCPUs were created, to bailing out if any VCPUs were ever run.  This
is an important distinction that shouldn't break anything, but allows
creating the VGIC after the VCPUs have been created.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-12-21 10:01:16 -08:00
Takuya Yoshikawa c08ac06ab3 KVM: Use cond_resched() directly and remove useless kvm_resched()
Since the commit 15ad7146 ("KVM: Use the scheduler preemption notifiers
to make kvm preemptible"), the remaining stuff in this function is a
simple cond_resched() call with an extra need_resched() check which was
there to avoid dropping VCPUs unnecessarily.  Now it is meaningless.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-12-13 14:23:45 +01:00
Gleb Natapov 95f328d3ad Merge branch 'kvm-ppc-queue' of git://github.com/agraf/linux-2.6 into queue
Conflicts:
	arch/powerpc/include/asm/processor.h
2013-11-04 10:20:57 +02:00
Michael S. Tsirkin 81e87e2679 kvm_host: typo fix
fix up typo in comment.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-31 22:14:23 +01:00
Alex Williamson e0f0bbc527 kvm: Create non-coherent DMA registeration
We currently use some ad-hoc arch variables tied to legacy KVM device
assignment to manage emulation of instructions that depend on whether
non-coherent DMA is present.  Create an interface for this, adapting
legacy KVM device assignment and adding VFIO via the KVM-VFIO device.
For now we assume that non-coherent DMA is possible any time we have a
VFIO group.  Eventually an interface can be developed as part of the
VFIO external user interface to query the coherency of a group.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-30 19:02:23 +01:00
Alex Williamson d96eb2c6f4 kvm/x86: Convert iommu_flags to iommu_noncoherent
Default to operating in coherent mode.  This simplifies the logic when
we switch to a model of registering and unregistering noncoherent I/O
with KVM.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-30 19:02:13 +01:00
Alex Williamson ec53500fae kvm: Add VFIO device
So far we've succeeded at making KVM and VFIO mostly unaware of each
other, but areas are cropping up where a connection beyond eventfds
and irqfds needs to be made.  This patch introduces a KVM-VFIO device
that is meant to be a gateway for such interaction.  The user creates
the device and can add and remove VFIO groups to it via file
descriptors.  When a group is added, KVM verifies the group is valid
and gets a reference to it via the VFIO external user interface.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-30 19:02:03 +01:00
Aneesh Kumar K.V 5587027ce9 kvm: Add struct kvm arg to memslot APIs
We will use that in the later patch to find the kvm ops handler

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-10-17 15:49:23 +02:00
chai wen f2e106692d KVM: Drop FOLL_GET in GUP when doing async page fault
Page pinning is not mandatory in kvm async page fault processing since
after async page fault event is delivered to a guest it accesses page once
again and does its own GUP.  Drop the FOLL_GET flag in GUP in async_pf
code, and do some simplifying in check/clear processing.

Suggested-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Gu zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: chai wen <chaiw.fnst@cn.fujitsu.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-10-15 13:43:37 +03:00
Christoffer Dall 6d9d41e574 KVM: Move gfn_to_index to x86 specific code
The gfn_to_index function relies on huge page defines which either may
not make sense on systems that don't support huge pages or are defined
in an unconvenient way for other architectures.  Since this is
x86-specific, move the function to arch/x86/include/asm/kvm_host.h.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-10-14 10:11:44 +03:00
Paolo Bonzini 2f303b74a6 KVM: Convert kvm_lock back to non-raw spinlock
In commit e935b8372c ("KVM: Convert kvm_lock to raw_spinlock"),
the kvm_lock was made a raw lock.  However, the kvm mmu_shrink()
function tries to grab the (non-raw) mmu_lock within the scope of
the raw locked kvm_lock being held.  This leads to the following:

BUG: sleeping function called from invalid context at kernel/rtmutex.c:659
in_atomic(): 1, irqs_disabled(): 0, pid: 55, name: kswapd0
Preemption disabled at:[<ffffffffa0376eac>] mmu_shrink+0x5c/0x1b0 [kvm]

Pid: 55, comm: kswapd0 Not tainted 3.4.34_preempt-rt
Call Trace:
 [<ffffffff8106f2ad>] __might_sleep+0xfd/0x160
 [<ffffffff817d8d64>] rt_spin_lock+0x24/0x50
 [<ffffffffa0376f3c>] mmu_shrink+0xec/0x1b0 [kvm]
 [<ffffffff8111455d>] shrink_slab+0x17d/0x3a0
 [<ffffffff81151f00>] ? mem_cgroup_iter+0x130/0x260
 [<ffffffff8111824a>] balance_pgdat+0x54a/0x730
 [<ffffffff8111fe47>] ? set_pgdat_percpu_threshold+0xa7/0xd0
 [<ffffffff811185bf>] kswapd+0x18f/0x490
 [<ffffffff81070961>] ? get_parent_ip+0x11/0x50
 [<ffffffff81061970>] ? __init_waitqueue_head+0x50/0x50
 [<ffffffff81118430>] ? balance_pgdat+0x730/0x730
 [<ffffffff81060d2b>] kthread+0xdb/0xe0
 [<ffffffff8106e122>] ? finish_task_switch+0x52/0x100
 [<ffffffff817e1e94>] kernel_thread_helper+0x4/0x10
 [<ffffffff81060c50>] ? __init_kthread_worker+0x

After the previous patch, kvm_lock need not be a raw spinlock anymore,
so change it back.

Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: kvm@vger.kernel.org
Cc: gleb@redhat.com
Cc: jan.kiszka@siemens.com
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-30 09:21:51 +02:00
Radim Krčmář 98fda16929 kvm: remove .done from struct kvm_async_pf
'.done' is used to mark the completion of 'async_pf_execute()', but
'cancel_work_sync()' returns true when the work was canceled, so we
use it instead.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-24 19:12:12 +02:00
Paolo Bonzini ba6a354154 KVM: mmu: allow page tables to be in read-only slots
Page tables in a read-only memory slot will currently cause a triple
fault because the page walker uses gfn_to_hva and it fails on such a slot.

OVMF uses such a page table; however, real hardware seems to be fine with
that as long as the accessed/dirty bits are set.  Save whether the slot
is readonly, and later check it when updating the accessed and dirty bits.

Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-17 12:52:31 +03:00
Dominik Dingel bf640876e2 KVM: s390: Make KVM_HVA_ERR_BAD usable on s390
Current common code uses PAGE_OFFSET to indicate a bad host virtual address.
As this check won't work on architectures that don't map kernel and user memory
into the same address space (e.g. s390), such architectures can now provide
their own KVM_HVA_ERR_BAD defines.

Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-29 09:03:53 +02:00
Takuya Yoshikawa e59dbe09f8 KVM: Introduce kvm_arch_memslots_updated()
This is called right after the memslots is updated, i.e. when the result
of update_memslots() gets installed in install_new_memslots().  Since
the memslots needs to be updated twice when we delete or move a memslot,
kvm_arch_commit_memory_region() does not correspond to this exactly.

In the following patch, x86 will use this new API to check if the mmio
generation has reached its maximum value, in which case mmio sptes need
to be flushed out.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Acked-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-07-18 12:29:25 +02:00
Cornelia Huck 126a5af520 KVM: kvm-io: support cookies
Add new functions kvm_io_bus_{read,write}_cookie() that allows users of
the kvm io infrastructure to use a cookie value to speed up lookup of a
device on an io bus.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-07-18 12:29:23 +02:00
Linus Torvalds fe489bf450 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
 "On the x86 side, there are some optimizations and documentation
  updates.  The big ARM/KVM change for 3.11, support for AArch64, will
  come through Catalin Marinas's tree.  s390 and PPC have misc cleanups
  and bugfixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (87 commits)
  KVM: PPC: Ignore PIR writes
  KVM: PPC: Book3S PR: Invalidate SLB entries properly
  KVM: PPC: Book3S PR: Allow guest to use 1TB segments
  KVM: PPC: Book3S PR: Don't keep scanning HPTEG after we find a match
  KVM: PPC: Book3S PR: Fix invalidation of SLB entry 0 on guest entry
  KVM: PPC: Book3S PR: Fix proto-VSID calculations
  KVM: PPC: Guard doorbell exception with CONFIG_PPC_DOORBELL
  KVM: Fix RTC interrupt coalescing tracking
  kvm: Add a tracepoint write_tsc_offset
  KVM: MMU: Inform users of mmio generation wraparound
  KVM: MMU: document fast invalidate all mmio sptes
  KVM: MMU: document fast invalidate all pages
  KVM: MMU: document fast page fault
  KVM: MMU: document mmio page fault
  KVM: MMU: document write_flooding_count
  KVM: MMU: document clear_spte_count
  KVM: MMU: drop kvm_mmu_zap_mmio_sptes
  KVM: MMU: init kvm generation close to mmio wrap-around value
  KVM: MMU: add tracepoint for check_mmio_spte
  KVM: MMU: fast invalidate all mmio sptes
  ...
2013-07-03 13:21:40 -07:00