Commit Graph

101 Commits

Author SHA1 Message Date
Michael S. Tsirkin 5fba13b5cf vhost: replace % with & on data path
We know vring num is a power of 2, so use &
to mask the high bits.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-12-07 17:28:10 +02:00
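
The masking trick described in this commit can be sketched in plain C. This is only an illustration under the assumption that the ring size is a nonzero power of two; `ring_slot` and `ring_slot_mod` are hypothetical names, not functions from vhost.

```c
#include <stdint.h>

/* Hypothetical helper, not the kernel's code: map a free-running
 * 16-bit ring index onto a slot in a ring of `num` entries.
 * Assumes num is a nonzero power of two, so (num - 1) is a mask
 * of the low bits. */
static inline uint16_t ring_slot(uint16_t idx, uint16_t num)
{
    return idx & (uint16_t)(num - 1);
}

/* Reference version using the modulo operator, for comparison. */
static inline uint16_t ring_slot_mod(uint16_t idx, uint16_t num)
{
    return idx % num;
}
```

For any power-of-two `num` both helpers return the same slot; the masked form avoids a division on the data path.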
Michael S. Tsirkin d542483876 vhost: relax log address alignment
commit 5d9a07b0de ("vhost: relax used
address alignment") fixed the alignment for the used virtual address,
but not for the physical address used for logging.

That's a mistake: alignment should clearly be the same for virtual and
physical addresses.

Cc: stable@vger.kernel.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-12-07 17:27:54 +02:00
Igor Mammedov 1e0994730f vhost: fix error handling for memory region alloc
Callers of vhost_kvzalloc() expect the same behaviour on
allocation error as from kmalloc/vmalloc, i.e. a NULL return
value. So just return the value returned by vzalloc() instead of
returning ERR_PTR(-ENOMEM).

Fixes: 4de7255f7d ("vhost: extend memory regions allocation to vmalloc")

Spotted-by: Dan Carpenter <dan.carpenter@oracle.com>
Suggested-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-07-27 18:05:05 +03:00
Marc-André Lureau 7932c0bd77 vhost: actually track log eventfd file
While reviewing vhost log code, I found out that log_file is never
set. Note: I haven't tested the change (QEMU doesn't use LOG_FD yet).

Cc: stable@vger.kernel.org
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-07-27 18:04:58 +03:00
Igor Mammedov c9ce42f72f vhost: add max_mem_regions module parameter
It became possible to use a larger number of memory
slots, which memory hotplug uses for
registering hotplugged memory.
However, QEMU crashes if it's used with more than ~60
pc-dimm devices and vhost-net enabled, since the host kernel's
vhost-net module refuses to accept more than 64
memory regions.

Allow tweaking the limit via the max_mem_regions module parameter,
with the default value set to 64 slots.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-07-13 23:17:19 +03:00
Igor Mammedov 4de7255f7d vhost: extend memory regions allocation to vmalloc
With a large number of memory regions we could end up with
high-order allocations, and kmalloc could fail if the
host is under memory pressure.
Considering that the memory regions array is used on the hot path,
try harder to allocate using kmalloc and, if it fails, resort
to vmalloc.
It's still better than just failing vhost_set_memory() and
crashing the guest when new memory is hotplugged
into it.

I'll still look at a QEMU-side solution to reduce the number of
memory regions it feeds to vhost to make things even better,
but it doesn't hurt for the kernel to behave smarter and not
crash older QEMUs which could use a large number of memory
regions.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-07-13 23:17:18 +03:00
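
The try-one-allocator-then-fall-back pattern from this commit can be sketched in userspace C. Everything here is a stand-in: `contig_zalloc` and `virt_zalloc` are hypothetical substitutes for kmalloc-style and vmalloc-style allocators, and the 4096-byte failure threshold is an arbitrary assumption used to make the fallback observable.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a physically contiguous allocator.
 * Assumption for the sketch: requests above 4096 bytes fail. */
static void *contig_zalloc(size_t size)
{
    return size > 4096 ? NULL : calloc(1, size);
}

/* Hypothetical stand-in for a virtually contiguous allocator. */
static void *virt_zalloc(size_t size)
{
    return calloc(1, size);
}

/* Try the contiguous allocator first, then fall back.
 * On failure the result is NULL, never an encoded error pointer,
 * so callers can keep the usual `if (!p)` check. */
static void *fallback_zalloc(size_t size)
{
    void *p = contig_zalloc(size);

    if (!p)
        p = virt_zalloc(size);
    return p;
}
```

Returning plain NULL on failure keeps the helper interchangeable with the allocators it wraps.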
Igor Mammedov bcfeacab45 vhost: use binary search instead of linear in find_region()
For default region layouts, performance stays the same
as with linear search, i.e. translate_desc(), which inlines
find_region(), takes around 210ns on average.

But it scales better with a larger number of regions:
235ns for binary search vs 300ns for linear search with 55 memory regions,
and the values will be about the same when the allowed number
of slots is increased to 509, as has been done in kvm.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-07-01 10:12:12 +02:00
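
A binary search over memory regions might look roughly like the sketch below. It is not the kernel's find_region(): the `mem_region` layout, the ascending sort order, and the function name are assumptions made for illustration, and the regions are assumed not to overlap.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical region descriptor for this sketch. */
struct mem_region {
    uint64_t guest_phys_addr;  /* start of the region */
    uint64_t size;             /* length in bytes */
};

/* Find the region containing addr, or NULL if none does.
 * Assumes `regions` is sorted ascending by guest_phys_addr
 * and that regions do not overlap. */
static const struct mem_region *
find_region_bsearch(const struct mem_region *regions, size_t n, uint64_t addr)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        const struct mem_region *r = &regions[mid];

        if (addr < r->guest_phys_addr)
            hi = mid;                       /* look in the lower half */
        else if (addr - r->guest_phys_addr >= r->size)
            lo = mid + 1;                   /* look in the upper half */
        else
            return r;                       /* addr falls inside r */
    }
    return NULL;
}
```

The lookup takes O(log n) comparisons, which is why the cost grows slowly as the number of regions increases.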
Greg Kurz 2751c9882b vhost: cross-endian support for legacy devices
This patch brings cross-endian support to vhost when used to implement
legacy virtio devices. Since it is a relatively rare situation, the
feature availability is controlled by a kernel config option (not set
by default).

The vq->is_le boolean field is added to cache the endianness to be
used for ring accesses. It defaults to native endian, as expected
by legacy virtio devices. When the ring gets active, we force little
endian if the device is modern. When the ring is deactivated, we
revert to the native endian default.

If cross-endian was compiled in, a vq->user_be boolean field is added
so that userspace may request a specific endianness. This field is
used to override the default when activating the ring of a legacy
device. It has no effect on modern devices.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2015-06-01 15:48:55 +02:00
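
The endianness selection described above can be sketched as follows. The struct, the `ring_activate` and `ring16_to_cpu` helpers, and the `cross_endian` flag (standing in for the kernel config option) are hypothetical; only the `is_le` and `user_be` names come from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced view of a virtqueue for this sketch. */
struct vq_sketch {
    bool is_le;  /* cached endianness used for ring accesses */
};

/* Detect host byte order at run time. */
static bool host_is_le(void)
{
    const uint16_t probe = 1;

    return *(const uint8_t *)&probe == 1;
}

static uint16_t bswap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Choose ring endianness when the ring becomes active:
 * modern devices are always little endian; legacy devices use the
 * userspace request if cross-endian support is enabled, otherwise
 * the native byte order. */
static void ring_activate(struct vq_sketch *vq, bool modern,
                          bool cross_endian, bool user_be)
{
    if (modern)
        vq->is_le = true;
    else if (cross_endian)
        vq->is_le = !user_be;
    else
        vq->is_le = host_is_le();
}

/* Convert a 16-bit ring field to host byte order. */
static uint16_t ring16_to_cpu(const struct vq_sketch *vq, uint16_t raw)
{
    return vq->is_le == host_is_le() ? raw : bswap16(raw);
}
```

Caching the decision in one boolean means each ring access only pays for a comparison and, in the cross-endian case, a byte swap.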
Al Viro aad9a1cec7 vhost: switch vhost get_indirect() to iov_iter, kill memcpy_fromiovec()
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: kvm@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-02-04 01:34:15 -05:00
Michael S. Tsirkin 5d9a07b0de vhost: relax used address alignment
virtio 1.0 only requires used address to be 4 byte aligned,
vhost required 8 bytes (size of vring_used_elem).
Fix up vhost to match that.

Additionally, while vhost correctly requires 8 byte
alignment for log, it's unconnected to used ring:
it's a consequence that log has u64 entries.
Tweak code to make that clearer.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-29 10:55:06 +02:00
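
The two alignment requirements mentioned in this commit are independent, which a small sketch can make explicit. The helper names and the separate `log_base` parameter are assumptions for illustration; the 4-byte and 8-byte values come from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if addr is a multiple of align; align must be a power of two. */
static bool is_aligned(uint64_t addr, uint64_t align)
{
    return (addr & (align - 1)) == 0;
}

/* Hypothetical validity check:
 * - the used ring address needs 4-byte alignment (virtio 1.0);
 * - the log base needs 8-byte alignment because log entries are u64,
 *   which has nothing to do with the used ring layout. */
static bool ring_addrs_ok(uint64_t used_addr, uint64_t log_base)
{
    return is_aligned(used_addr, 4) &&
           is_aligned(log_base, sizeof(uint64_t));
}
```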
Michael S. Tsirkin 3b1bbe8935 vhost: virtio 1.0 endian-ness support
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-09 12:05:29 +02:00
Michael S. Tsirkin 64f7f0510c vhost: switch to __get/__put_user exclusively
Most places in vhost can use __get/__put_user rather than
get/put_user since addresses are pre-validated.
This should be good for performance, but this also
will help make code sparse-clean: get/put_user macros
don't play well with __virtioXX bitwise tags.
Switch get/put_user to the __ variants everywhere in vhost.
There's one exception - for consistency switch that
as well, and add an explicit access_ok check.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-12-09 12:05:29 +02:00
Michael S. Tsirkin 47283bef7e vhost: move memory pointer to VQs
commit 2ae76693b8bcabf370b981cd00c36cd41d33fabc
    vhost: replace rcu with mutex
replaced rcu sync for memory accesses with VQ mutex lock/unlock.
This is correct since all accesses are under VQ mutex, but incomplete:
we still do useless rcu lock/unlock operations, someone might copy this
code into some other context where this won't be right.
This use of RCU is also non standard and hard to understand.
Let's copy the pointer to each VQ structure, this way
the access rules become straight-forward, and there's
no need for RCU anymore.

Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-09 16:21:07 +03:00
Michael S. Tsirkin ea16c51433 vhost: move acked_features to VQs
Refactor code to make sure features are only accessed
under VQ mutex. This makes everything simpler, no need
for RCU here anymore.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-09 16:21:06 +03:00
Michael S. Tsirkin 98f9ca0a3f vhost: replace rcu with mutex
All memory accesses are done under some VQ mutex.
So lock/unlock all VQs is a faster equivalent of synchronize_rcu()
for memory access changes.
Some guests cause a lot of these changes, so it's helpful
to make them faster.

Reported-by: "Gonglei (Arei)" <arei.gonglei@huawei.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-09 16:21:06 +03:00
Zhi Yong Wu 59566b6e8c vhost: remove the dead branch
Since vhost_dev_init() always returns 0, some branches are never run
and therefore need to be removed.

Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-06 15:22:05 -05:00
Qin Chuanyu ac9fde2474 vhost: wake up worker outside spin_lock
The wake_up_process() call is wrapped by spin_lock/unlock in
vhost_work_queue,
but it could be done outside the spin_lock.
I have tested it with kernel 3.0.27 and a suse11-sp2 guest using iperf;
the numbers are below.
                  original                 modified
thread_num  tp(Gbps)   vhost(%)  |  tp(Gbps)     vhost(%)
1           9.59        28.82    |   9.59        27.49
8           9.61        32.92    |   9.62        26.77
64          9.58        46.48    |   9.55        38.99
256         9.6         63.7     |   9.6         52.59

Signed-off-by: Chuanyu Qin <qinchuanyu@huawei.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-09-17 09:21:32 +03:00
Jason Wang c49e4e573b vhost: switch to use vhost_add_used_n()
Let vhost_add_used() use vhost_add_used_n() to reduce code
duplication. To avoid the overhead brought by __copy_to_user(), we will use
put_user() when only one used entry needs to be added.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-09-03 22:46:57 -04:00
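
The idea of routing the single-entry call through the batch function, while keeping a cheap path for one entry, can be sketched in userspace C. The types, the fixed `RING_SIZE`, and the function names are hypothetical; plain stores and memcpy() stand in for put_user() and __copy_to_user().

```c
#include <stdint.h>
#include <string.h>

#define RING_SIZE 8u  /* must be a power of two in this sketch */

struct used_elem {
    uint32_t id;
    uint32_t len;
};

struct used_ring {
    uint16_t idx;                       /* free-running producer index */
    struct used_elem ring[RING_SIZE];
};

/* Publish `count` entries. Assumes count <= RING_SIZE. */
static void add_used_n(struct used_ring *u, const struct used_elem *heads,
                       unsigned int count)
{
    unsigned int start = u->idx & (RING_SIZE - 1);

    if (count == 1) {
        /* Single entry: two plain stores, no bulk copy. */
        u->ring[start].id = heads[0].id;
        u->ring[start].len = heads[0].len;
    } else {
        /* Bulk path: copy up to the end of the ring, then wrap. */
        unsigned int first = RING_SIZE - start;

        if (first > count)
            first = count;
        memcpy(&u->ring[start], heads, first * sizeof(*heads));
        memcpy(&u->ring[0], heads + first,
               (count - first) * sizeof(*heads));
    }
    u->idx = (uint16_t)(u->idx + count);
}

/* The single-entry API is a thin wrapper around the batch one. */
static void add_used(struct used_ring *u, uint32_t id, uint32_t len)
{
    struct used_elem e = { id, len };

    add_used_n(u, &e, 1);
}
```

With this layout there is one place that updates the index, and the wrapper adds no duplicated ring logic.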
Asias He 35596b2796 vhost: Include linux/uio.h instead of linux/socket.h
memcpy_fromiovec is moved from net/core/iovec.c to lib/iovec.c.
linux/uio.h provides the declaration for memcpy_fromiovec.

Include linux/uio.h instead of linux/socket.h for it.

Signed-off-by: Asias He <asias@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-08-20 15:05:04 -07:00
Asias He 6ac1afbf61 vhost: Make vhost a separate module
Currently, vhost-net and vhost-scsi are sharing the vhost core code.
However, vhost-scsi shares the code by including the vhost.c file
directly.

Making vhost a separate module makes it easier to share code with
other vhost devices.

Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-07-07 17:33:44 +03:00
Asias He 6d5e6aa860 vhost: Simplify dev->vqs[i] access
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-07-07 14:38:26 +03:00
Michael S. Tsirkin 05c0535194 vhost: check owner before we overwrite ubuf_info
If device has an owner, we shouldn't touch ubuf_info
since it might be in use.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-11 02:46:21 -07:00
Michael S. Tsirkin 7542a6b0d2 vhost: drop virtio_net.h dependency
There's no net specific code in vhost.c anymore,
don't include the virtio_net.h header.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-05-06 14:04:06 +03:00
Asias He 54db63c2ca vhost: Export vhost_dev_set_owner
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-05-06 12:57:54 +03:00
Michael S. Tsirkin 150b9e51ae vhost: fix error handling in RESET_OWNER ioctl
RESET_OWNER ioctl would leave the fd in a bad state if
memory allocation failed: device is stopped
but owner is not reset. Make state changes
after allocating memory, such that a failed
ioctl has no effect.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2013-05-01 10:02:54 +03:00
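
The allocate-first ordering described in this commit can be sketched as below. The `dev_sketch` struct, `reset_owner`, and the injectable allocator are hypothetical and only illustrate the ordering, not the actual ioctl code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, reduced device state for this sketch. */
struct dev_sketch {
    bool running;
    int owner;      /* 0 means "no owner" in this sketch */
    void *memory;   /* stand-in for the device's memory table */
};

typedef void *(*alloc_fn)(size_t);

/* Allocator that always fails, used to exercise the error path. */
static void *failing_alloc(size_t size)
{
    (void)size;
    return NULL;
}

/* Reset ownership. The allocation happens before any state change,
 * so a failed call leaves the device exactly as it was. */
static int reset_owner(struct dev_sketch *d, alloc_fn alloc)
{
    void *fresh = alloc(64);

    if (!fresh)
        return -ENOMEM;     /* nothing has been modified yet */

    d->running = false;     /* stop the device */
    d->owner = 0;           /* drop the owner */
    free(d->memory);
    d->memory = fresh;      /* install the fresh table */
    return 0;
}
```

Doing the only fallible step first is what makes the operation all-or-nothing from the caller's point of view.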