This patch brings cross-endian support to vhost when used to implement
legacy virtio devices. Since it is a relatively rare situation, the
feature availability is controlled by a kernel config option (not set
by default).
The vq->is_le boolean field is added to cache the endianness to be
used for ring accesses. It defaults to native endian, as expected
by legacy virtio devices. When the ring gets active, we force little
endian if the device is modern. When the ring is deactivated, we
revert to the native endian default.
If cross-endian was compiled in, a vq->user_be boolean field is added
so that userspace may request a specific endianness. This field is
used to override the default when activating the ring of a legacy
device. It has no effect on modern devices.
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
The current memory accessors logic is:
- little endian if little_endian
- native endian (i.e. no byteswap) if !little_endian
If we want to fully support cross-endian vhost, we also need to be
able to convert to big endian.
Instead of changing the little_endian argument to some 3-value enum, this
patch changes the logic to:
- little endian if little_endian
- big endian if !little_endian
The native endian case is handled by all users with a trivial helper. This
patch doesn't change any functionality, nor it does add overhead.
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
commit 2ae76693b8bcabf370b981cd00c36cd41d33fabc
vhost: replace rcu with mutex
replaced rcu sync for memory accesses with VQ mutex locl/unlock.
This is correct since all accesses are under VQ mutex, but incomplete:
we still do useless rcu lock/unlock operations, someone might copy this
code into some other context where this won't be right.
This use of RCU is also non standard and hard to understand.
Let's copy the pointer to each VQ structure, this way
the access rules become straight-forward, and there's
no need for RCU anymore.
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Refactor code to make sure features are only accessed
under VQ mutex. This makes everything simpler, no need
for RCU here anymore.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Since vhost_dev_init() forever return 0, some branches are never run,
therefore need to be removed.
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now, vq->private_data is always accessed under vq mutex. No need to play
the vhost rcu trick.
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently, vhost-net and vhost-scsi are sharing the vhost core code.
However, vhost-scsi shares the code by including the vhost.c file
directly.
Making vhost a separate module makes it is easier to share code with
other vhost devices.
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
If device has an owner, we shouldn't touch ubuf_info
since it might be in use.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It is supposed to be removed when hdr is moved into vhost_net_virtqueue.
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
vhost.h should not depend on device specific marcos like
VHOST_NET_F_VIRTIO_NET_HDR and VIRTIO_NET_F_MRG_RXBUF.
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
RESET_OWNER ioctl would leave the fd in a bad state if
memory allocation failed: device is stopped
but owner is not reset. Make state changes
after allocating memory, such that a failed
ioctl has no effect.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
On top of 'vhost: Allow device specific fields per vq', we can move device
specific fields to device virt queue from vhost virt queue.
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This is useful for any device who wants device specific fields per vq.
For example, tcm_vhost wants a per vq field to track requests which are
in flight on the vq. Also, on top of this we can add patches to move
things like ubufs from vhost.h out to net.c.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently, the polling errors were ignored, which can lead following issues:
- vhost remove itself unconditionally from waitqueue when stopping the poll,
this may crash the kernel since the previous attempt of starting may fail to
add itself to the waitqueue
- userspace may think the backend were successfully set even when the polling
failed.
Solve this by:
- check poll->wqh before trying to remove from waitqueue
- report polling errors in vhost_poll_start(), tx_poll_start(), the return value
will be checked and returned when userspace want to set the backend
After this fix, there still could be a polling failure after backend is set, it
will addressed by the next patch.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
vring changes already do a flush internally where appropriate, so we do
not need a second flush.
It's currently not very expensive but a follow-up patch makes flush more
heavy-weight, so remove the extra flush here to avoid regressing
performance if call or kick fds are changed on data path.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Zerocopy handling code is vhost-net specific.
Move it from vhost.c/vhost.h out to net.c
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This will be used to disable zerocopy when error rate
is high.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>