it became possible to use a bigger amount of memory
slots, which is used by memory hotplug for
registering hotplugged memory.
However QEMU crashes if it's used with more than ~60
pc-dimm devices and vhost-net enabled since host kernel
in module vhost-net refuses to accept more than 64
memory regions.
Allow to tweak limit via max_mem_regions module paramemter
with default value set to 64 slots.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
with large number of memory regions we could end up with
high order allocations and kmalloc could fail if
host is under memory pressure.
Considering that memory regions array is used on hot path
try harder to allocate using kmalloc and if it fails resort
to vmalloc.
It's still better than just failing vhost_set_memory() and
causing guest crash due to it when a new memory hotplugged
to guest.
I'll still look at QEMU side solution to reduce amount of
memory regions it feeds to vhost to make things even better,
but it doesn't hurt for kernel to behave smarter and don't
crash older QEMU's which could use large amount of memory
regions.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
On device shutdown/removal, virtio drivers need to trigger a reset on
the device; if this is neglected, the virtio core will complain about
non-zero device status.
This patch resets the status when the 9p virtio driver is removed
from the system by calling vdev->config->reset on the virtio_device
to send a reset to the host virtio device.
Signed-off-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The s390-specific virtio drivers have probably more to do with virtio
than with kvm today; let's move them out into a separate section to
reflect this and to be able to add relevant mailing lists.
CC: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The virtio_ring.h header is used in userspace programs (ie. QEMU),
too. Here we can not assume that sizeof(pointer) is the same as
sizeof(long), e.g. when compiling for Windows, so the typecast in
vring_init() should be done with (uintptr_t) instead of (unsigned long).
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
drivers/scsi/virtio_scsi.c: In function 'virtscsi_probe':
drivers/scsi/virtio_scsi.c:952:11: warning: unused variable 'host_prot' [-Wunused-variable]
int err, host_prot;
^
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
For default region layouts performance stays the same
as linear search i.e. it takes around 210ns average for
translate_desc() that inlines find_region().
But it scales better with larger amount of regions,
235ns BS vs 300ns LS with 55 memory regions
and it will be about the same values when allowed number
of slots is increased to 509 like it has been done in kvm.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Document VIRTIO_NET_CTRL_GUEST_OFFLOADS and the
relevant feature bits.
Will allow ethtool control of the offloads down the road.
Reported-by: Yan Vugenfirer <yan@daynix.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Move resource allocation from common code to legacy and modern code.
Only request resources actually used, i.e. bar0 in legacy mode and
the bar(s) specified by capabilities in modern mode.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The VNET_LE flag was introduced to fix accesses to virtio 1.0 headers
that are always little-endian. It can also be used to handle the special
case of a legacy little-endian device implemented by a big-endian host.
Let's add a flag and ioctls for big-endian devices as well. If both flags
are set, little-endian wins.
Since this is isn't a common usecase, the feature is controlled by a kernel
config option (not set by default).
Both macvtap and tun are covered by this patch since they share the same
API with userland.
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
This patch brings cross-endian support to vhost when used to implement
legacy virtio devices. Since it is a relatively rare situation, the
feature availability is controlled by a kernel config option (not set
by default).
The vq->is_le boolean field is added to cache the endianness to be
used for ring accesses. It defaults to native endian, as expected
by legacy virtio devices. When the ring gets active, we force little
endian if the device is modern. When the ring is deactivated, we
revert to the native endian default.
If cross-endian was compiled in, a vq->user_be boolean field is added
so that userspace may request a specific endianness. This field is
used to override the default when activating the ring of a legacy
device. It has no effect on modern devices.
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
The current memory accessors logic is:
- little endian if little_endian
- native endian (i.e. no byteswap) if !little_endian
If we want to fully support cross-endian vhost, we also need to be
able to convert to big endian.
Instead of changing the little_endian argument to some 3-value enum, this
patch changes the logic to:
- little endian if little_endian
- big endian if !little_endian
The native endian case is handled by all users with a trivial helper. This
patch doesn't change any functionality, nor it does add overhead.
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Fixes userspace compilation error:
error: unknown type name ‘__virtio16’
__virtio16 tag;
Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Pull vfs fix from Al Viro:
"Off-by-one in d_walk()/__dentry_kill() race fix.
It's very hard to hit; possible in the same conditions as the original
bug, except that you need the skipped branch to contain all the
remaining evictables, so that the d_walk()-calling loop in
d_invalidate() decides there's nothing more to do and doesn't go for
another pass - otherwise that next pass will sweep the sucker.
So it's not too urgent, but seeing that the fix is obvious and the
original commit has spread into all -stable branches..."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
d_walk() might skip too much
Pull ARM fixes from Russell King:
"Three fixes this time around:
- fix a memory leak which occurs when probing performance monitoring
unit interrupts
- fix handling of non-PMD aligned end of RAM causing boot failures
- fix missing syscall trace exit path with syscall tracing enabled
causing a kernel oops in the audit code"
* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
ARM: 8357/1: perf: fix memory leak when probing PMU PPIs
ARM: fix missing syscall trace exit
ARM: 8356/1: mm: handle non-pmd-aligned end of RAM
Pull MIPS fixes from Ralf Baechle:
"MIPS fixes for 4.1 all across the tree"
* 'upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/ralf/linux:
MIPS: strnlen_user.S: Fix a CPU_DADDI_WORKAROUNDS regression
MIPS: BMIPS: Fix bmips_wr_vec()
MIPS: ath79: fix build problem if CONFIG_BLK_DEV_INITRD is not set
MIPS: Fuloong 2E: Replace CONFIG_USB_ISP1760_HCD by CONFIG_USB_ISP1760
MIPS: irq: Use DECLARE_BITMAP
ttyFDC: Fix to use native endian MMIO reads
MIPS: Fix CDMM to use native endian MMIO reads
Pull turbostat tool fixes from Len Brown:
"Just one minor kernel dependency in this batch -- added a #define to
msr-index.h"
* 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
tools/power turbostat: update version number to 4.7
tools/power turbostat: allow running without cpu0
tools/power turbostat: correctly decode of ENERGY_PERFORMANCE_BIAS
tools/power turbostat: enable turbostat to support Knights Landing (KNL)
tools/power turbostat: correctly display more than 2 threads/core