782 Commits

Author SHA1 Message Date
Wu Zongyong
ea8f17e44f vhost-vdpa: clean irqs before reseting vdpa device
vDPA devices should be reset only after the irqs of their virtqueues
have been unset; otherwise we get errors when killing the qemu process:

>> pi_update_irte: failed to update PI IRTE
>> irq bypass consumer (token 0000000065102a43) unregistration fails: -22

Signed-off-by: Wu Zongyong <wuzongyong@linux.alibaba.com>
Link: https://lore.kernel.org/r/a2cb60cf73be9da5c4e6399242117d8818f975ae.1636946171.git.wuzongyong@linux.alibaba.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2021-11-24 19:00:28 -05:00
Stefano Garzarella
11708ff92c vhost/vsock: cleanup removing len variable
We can increment `total_len` directly and remove `len` since it
is no longer used for vhost_add_used().

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20211122163525.294024-3-sgarzare@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-11-24 19:00:28 -05:00
Stefano Garzarella
49d8c5ffad vhost/vsock: fix incorrect used length reported to the guest
The "used length" reported by calling vhost_add_used() must be the
number of bytes written by the device (using "in" buffers).

In vhost_vsock_handle_tx_kick() the device only reads the guest
buffers (they are all "out" buffers), without writing anything,
so we must pass 0 as "used length" to comply with the virtio spec.

Fixes: 433fc58e6b ("VSOCK: Introduce vhost_vsock.ko")
Cc: stable@vger.kernel.org
Reported-by: Halil Pasic <pasic@linux.ibm.com>
Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20211122163525.294024-2-sgarzare@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
2021-11-24 19:00:28 -05:00
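
The rule in the commit above can be modelled in a few lines. This is a minimal Python sketch, not the vhost code: `Buffer`, `used_length()` and the example chains are hypothetical names introduced only to show that bytes the device reads from "out" buffers never count toward the used length.

```python
from dataclasses import dataclass


@dataclass
class Buffer:
    """One descriptor in a chain; device_writable marks an "in" buffer."""
    length: int
    device_writable: bool


def used_length(chain, bytes_written):
    """Return the "used length" to report for a completed chain.

    Only bytes the device wrote into "in" buffers count. Bytes the
    device merely read from "out" buffers are never included.
    """
    in_capacity = sum(b.length for b in chain if b.device_writable)
    if bytes_written > in_capacity:
        raise ValueError("device cannot write more than the 'in' buffers hold")
    return bytes_written


# TX kick: every buffer is an "out" buffer, so the device only reads.
tx_chain = [Buffer(64, False), Buffer(4096, False)]
tx_used = used_length(tx_chain, bytes_written=0)

# RX path: the device fills an "in" buffer with 100 bytes.
rx_chain = [Buffer(4096, True)]
rx_used = used_length(rx_chain, bytes_written=100)
```

In this model the TX chain has no writable capacity at all, so 0 is the only value that can be reported for it.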
Parav Pandit
6dbb1f1687 vdpa: Introduce and use vdpa device get, set config helpers
Subsequent patches enable getting and setting the configuration either
via the management device or via the vdpa device's config ops.

This requires synchronization between multiple callers of the get and
set config callbacks. Feature settings also influence the endianness of
the configuration fields.

To avoid exposing synchronization primitives to callers, introduce
a helper for setting the configuration and use it.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20211026175519.87795-2-parav@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-11-01 05:26:49 -04:00
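
The idea of hiding the synchronization behind helpers could look roughly like the following. This is a hedged Python sketch rather than the kernel interface: `VdpaDeviceModel`, `_cf_lock` and the byte values are assumptions made up for illustration.

```python
import threading


class VdpaDeviceModel:
    """Hypothetical stand-in for a vdpa device with a config space."""

    def __init__(self, size):
        self._config = bytearray(size)
        self._cf_lock = threading.Lock()  # never handed out to callers

    def get_config(self, offset, length):
        """Read `length` bytes of config space while holding the lock."""
        with self._cf_lock:
            return bytes(self._config[offset:offset + length])

    def set_config(self, offset, data):
        """Write `data` into config space while holding the lock."""
        with self._cf_lock:
            if offset + len(data) > len(self._config):
                raise ValueError("write past end of config space")
            self._config[offset:offset + len(data)] = data


dev = VdpaDeviceModel(size=8)
dev.set_config(2, b"\x52\x54")
prefix = dev.get_config(2, 2)
```

Callers only ever see `get_config()` and `set_config()`, so the locking scheme can change later without touching them.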
Linus Torvalds
3bb50f8530 Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio fixes from Michael Tsirkin:
 "Fixes up some issues in rc5"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  vhost-vdpa: Fix the wrong input in config_cb
  VDUSE: fix documentation underline warning
  Revert "virtio-blk: Add validation for block size in config space"
  vhost_vdpa: unset vq irq before freeing irq
  virtio: write back F_VERSION_1 before validate
2021-10-17 18:17:19 -10:00
Cindy Lu
bcef9356fc vhost-vdpa: Fix the wrong input in config_cb
Fix the wrong input for config_cb. The function vhost_vdpa_config_cb()
uses cb.private as a struct vhost_vdpa, so the input passed here was
wrong. Fix this issue.

Fixes: 776f395004 ("vhost_vdpa: Support config interrupt in vdpa")
Signed-off-by: Cindy Lu <lulu@redhat.com>
Link: https://lore.kernel.org/r/20210929090933.20465-1-lulu@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-10-13 08:42:07 -04:00
Wu Zongyong
97f854be20 vhost_vdpa: unset vq irq before freeing irq
Currently we unset vq irq after freeing irq and that will result in
error messages:

  pi_update_irte: failed to update PI IRTE
  irq bypass consumer (token 000000005a07a12b) unregistration fails: -22

This patch solves this by unsetting the vq irq before freeing the irq.

Signed-off-by: Wu Zongyong <wuzongyong@linux.alibaba.com>
Link: https://lore.kernel.org/r/02637d38dcf4e4b836c5b3a65055fe92bf812b3b.1631687872.git.wuzongyong@linux.alibaba.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2021-10-13 08:35:36 -04:00
Linus Torvalds
d33bec7b3d Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio/vdpa fixes from Michael Tsirkin:
 "Fixes up some issues in rc1"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  vdpa: potential uninitialized return in vhost_vdpa_va_map()
  vdpa/mlx5: Avoid executing set_vq_ready() if device is reset
  vdpa/mlx5: Clear ready indication for control VQ
  vduse: Cleanup the old kernel states after reset failure
  vduse: missing error code in vduse_init()
  virtio: don't fail on !of_device_is_compatible
2021-09-28 07:27:29 -07:00
Linus Torvalds
fc0c0548c1 Merge tag 'net-5.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Including fixes from bpf.

  Current release - regressions:

   - vhost_net: fix OoB on sendmsg() failure

   - mlx5: bridge, fix uninitialized variable usage

   - bnxt_en: fix error recovery regression

  Current release - new code bugs:

   - bpf, mm: fix lockdep warning triggered by stack_map_get_build_id_offset()

  Previous releases - regressions:

   - r6040: restore MDIO clock frequency after MAC reset

   - tcp: fix tp->undo_retrans accounting in tcp_sacktag_one()

   - dsa: flush switchdev workqueue before tearing down CPU/DSA ports

  Previous releases - always broken:

   - ptp: dp83640: don't define PAGE0, avoid compiler warning

   - igc: fix tunnel segmentation offloads

   - phylink: update SFP selected interface on advertising changes

   - stmmac: fix system hang caused by eee_ctrl_timer during suspend/resume

   - mlx5e: fix mutual exclusion between CQE compression and HW TS

  Misc:

   - bpf, cgroups: fix cgroup v2 fallback on v1/v2 mixed mode

   - sfc: fallback for lack of xdp tx queues

   - hns3: add option to turn off page pool feature"

* tag 'net-5.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (67 commits)
  mlxbf_gige: clear valid_polarity upon open
  igc: fix tunnel offloading
  net/{mlx5|nfp|bnxt}: Remove unnecessary RTNL lock assert
  net: wan: wanxl: define CROSS_COMPILE_M68K
  selftests: nci: replace unsigned int with int
  net: dsa: flush switchdev workqueue before tearing down CPU/DSA ports
  Revert "net: phy: Uniform PHY driver access"
  net: dsa: destroy the phylink instance on any error in dsa_slave_phy_setup
  ptp: dp83640: don't define PAGE0
  bnx2x: Fix enabling network interfaces without VFs
  Revert "Revert "ipv4: fix memory leaks in ip_cmsg_send() callers""
  tcp: fix tp->undo_retrans accounting in tcp_sacktag_one()
  net-caif: avoid user-triggerable WARN_ON(1)
  bpf, selftests: Add test case for mixed cgroup v1/v2
  bpf, selftests: Add cgroup v1 net_cls classid helpers
  bpf, cgroups: Fix cgroup v2 fallback on v1/v2 mixed mode
  bpf: Add oversize check before call kvcalloc()
  net: hns3: fix the timing issue of VF clearing interrupt sources
  net: hns3: fix the exception when query imp info
  net: hns3: disable mac in flr process
  ...
2021-09-16 13:05:42 -07:00
Dan Carpenter
be9c6bad9b vdpa: potential uninitialized return in vhost_vdpa_va_map()
The concern here is that "ret" can be uninitialized if we hit the
"goto next" condition on every iteration through the loop.

Fixes: 41ba1b5f9d4b ("vdpa: Support transferring virtual addressing during DMA mapping")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20210907073253.GB18254@kili
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2021-09-14 18:10:43 -04:00
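
The pattern behind this fix can be shown with a small model. This is a sketch in Python, not the kernel function: `map_regions()` and the region dictionaries are hypothetical, and the choice of 0 as the initial value is an assumption, since the message above does not say which value the patch uses.

```python
def map_regions(regions):
    """Map each usable region; return 0 on success or a negative error code.

    `ret` gets a value before the loop so the function still returns
    something defined when every region is skipped.
    """
    ret = 0  # assumption: an all-skipped walk counts as success
    for region in regions:
        if not region["mappable"]:
            continue  # plays the role of "goto next"
        ret = region["map_result"]
        if ret:
            break  # stop at the first failing region
    return ret


all_skipped = map_regions([{"mappable": False, "map_result": 0}] * 3)
one_failure = map_regions([
    {"mappable": True, "map_result": 0},
    {"mappable": True, "map_result": -12},
])
```

Python would raise `UnboundLocalError` without the initial assignment; in C the same omission returns whatever happens to be on the stack.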
Linus Torvalds
78e709522d Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio updates from Michael Tsirkin:

 - vduse driver ("vDPA Device in Userspace") supporting emulated virtio
   block devices

 - virtio-vsock support for end of record with SEQPACKET

 - vdpa: mac and mq support for ifcvf and mlx5

 - vdpa: management netlink for ifcvf

 - virtio-i2c, gpio dt bindings

 - misc fixes and cleanups

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (39 commits)
  Documentation: Add documentation for VDUSE
  vduse: Introduce VDUSE - vDPA Device in Userspace
  vduse: Implement an MMU-based software IOTLB
  vdpa: Support transferring virtual addressing during DMA mapping
  vdpa: factor out vhost_vdpa_pa_map() and vhost_vdpa_pa_unmap()
  vdpa: Add an opaque pointer for vdpa_config_ops.dma_map()
  vhost-iotlb: Add an opaque pointer for vhost IOTLB
  vhost-vdpa: Handle the failure of vdpa_reset()
  vdpa: Add reset callback in vdpa_config_ops
  vdpa: Fix some coding style issues
  file: Export receive_fd() to modules
  eventfd: Export eventfd_wake_count to modules
  iova: Export alloc_iova_fast() and free_iova_fast()
  virtio-blk: remove unneeded "likely" statements
  virtio-balloon: Use virtio_find_vqs() helper
  vdpa: Make use of PFN_PHYS/PFN_UP/PFN_DOWN helper macro
  vsock_test: update message bounds test for MSG_EOR
  af_vsock: rename variables in receive loop
  virtio/vsock: support MSG_EOR bit processing
  vhost/vsock: support MSG_EOR bit processing
  ...
2021-09-11 14:48:42 -07:00
Paolo Abeni
3c4cea8fa7 vhost_net: fix OoB on sendmsg() failure.
If the sendmsg() call in vhost_tx_batch() fails, both the 'batched_xdp'
and 'done_idx' indexes are left unchanged. If such failure happens
when batched_xdp == VHOST_NET_BATCH, the next call to
vhost_net_build_xdp() will access and write memory outside the xdp
buffers area.

Since sendmsg() can only fail with EBADFD, this change addresses the
issue by explicitly freeing the XDP buffers batch on error.

Fixes: 0a0be13b8f ("vhost_net: batch submitting XDP buffers to underlayer sockets")
Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-09 10:52:12 +01:00
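
The failure mode described above can be reproduced with a toy model. This is a Python sketch under stated assumptions: `BATCH` stands in for VHOST_NET_BATCH (its real value is not given here), and `TxBatch`, `build_xdp()` and `flush()` are hypothetical names, not the vhost_net functions.

```python
BATCH = 4  # stand-in for VHOST_NET_BATCH; the real value is not given above


class TxBatch:
    def __init__(self):
        self.xdp = [None] * BATCH  # fixed-size buffer area
        self.batched_xdp = 0       # index of the next free slot

    def build_xdp(self, pkt):
        """Place a packet in the next slot, refusing to write out of bounds."""
        if self.batched_xdp >= BATCH:
            raise IndexError("write outside the xdp buffers area")
        self.xdp[self.batched_xdp] = pkt
        self.batched_xdp += 1

    def flush(self, sendmsg):
        """Submit the batch; on failure, drop the buffers before resetting."""
        try:
            sendmsg(self.xdp[:self.batched_xdp])
        except OSError:
            # Free the batch on error so stale entries are not kept around.
            self.xdp = [None] * BATCH
        # Reset the index on both paths so the next build starts at slot 0.
        self.batched_xdp = 0


def failing_sendmsg(batch):
    raise OSError("EBADFD")


tx = TxBatch()
for i in range(BATCH):
    tx.build_xdp(f"pkt{i}")
tx.flush(failing_sendmsg)
tx.build_xdp("next")  # lands in slot 0 instead of slot BATCH
```

If `flush()` returned early on error without resetting `batched_xdp`, the final `build_xdp()` call would index slot `BATCH`, which is the out-of-bounds write the commit describes.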
Xie Yongji
d8945ec411 vdpa: Support transferring virtual addressing during DMA mapping
This patch introduces an attribute for a vDPA device to indicate
whether virtual addresses can be used. If the vDPA device driver sets
it, the vhost-vdpa bus driver will not pin user pages and will pass
userspace virtual addresses instead of physical addresses during
DMA mapping. The corresponding vma->vm_file and offset are also
passed as an opaque pointer.

Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20210831103634.33-11-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-09-06 07:20:57 -04:00
Xie Yongji
22af48cf91 vdpa: factor out vhost_vdpa_pa_map() and vhost_vdpa_pa_unmap()
The upcoming patch is going to support VA mapping/unmapping.
So let's first factor out the logic of PA mapping/unmapping
to make the code more readable.

Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20210831103634.33-10-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-09-06 07:20:57 -04:00
Xie Yongji
c10fb9454a vdpa: Add an opaque pointer for vdpa_config_ops.dma_map()
Add an opaque pointer for DMA mapping.

Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20210831103634.33-9-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-09-06 07:20:57 -04:00
Xie Yongji
59dfe4f1e8 vhost-iotlb: Add an opaque pointer for vhost IOTLB
Add an opaque pointer for vhost IOTLB. And introduce
vhost_iotlb_add_range_ctx() to accept it.

Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20210831103634.33-8-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-09-06 07:20:57 -04:00
Xie Yongji
7f05630dc6 vhost-vdpa: Handle the failure of vdpa_reset()
vdpa_reset() may fail now. This adds a check of its return
value and fails vhost_vdpa_open() if the reset failed.

Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20210831103634.33-7-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-09-06 07:20:57 -04:00
Xie Yongji
0686082dbf vdpa: Add reset callback in vdpa_config_ops
This adds a new callback to support device specific reset
behavior. The vdpa bus driver will call the reset function
instead of setting status to zero during resetting.

Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Link: https://lore.kernel.org/r/20210831103634.33-6-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-09-06 07:20:57 -04:00
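
The shift from "write status zero" to "call a reset callback" might be pictured as follows. This is a Python sketch, not the vdpa_config_ops definition: `DeviceOps`, `queues_ready`, `bus_reset()` and the integer return value are assumptions introduced for illustration.

```python
class DeviceOps:
    """Hypothetical ops table holding only the two callbacks discussed here."""

    def __init__(self):
        self.status = 0xF
        self.queues_ready = True  # made-up device-specific state
        self.calls = []

    def set_status(self, status):
        self.calls.append(("set_status", status))
        self.status = status

    def reset(self):
        # Device-specific work beyond clearing the status byte.
        self.calls.append(("reset",))
        self.queues_ready = False
        self.status = 0
        return 0  # assumption: 0 signals a successful reset


def bus_reset(ops):
    """Bus-driver side of a reset: delegate to the device's callback."""
    return ops.reset()


ops = DeviceOps()
rc = bus_reset(ops)
```

Because the device owns the callback, it can tear down state that a plain status write would not reach, and it has a way to report that the reset did not work.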
Cai Huoqing
729ce5a5bd vdpa: Make use of PFN_PHYS/PFN_UP/PFN_DOWN helper macro
It's a nice refactor to make use of the
PFN_PHYS/PFN_UP/PFN_DOWN helper macros.

Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Link: https://lore.kernel.org/r/20210802013717.851-1-caihuoqing@baidu.com
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-09-06 02:25:16 -04:00
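
For readers unfamiliar with these helpers, their arithmetic can be written out directly. The Python sketch below mirrors the usual definitions of the three macros; `PAGE_SHIFT = 12` (4 KiB pages) is an assumption, and the lower-case function names are mine.

```python
PAGE_SHIFT = 12              # assumption: 4 KiB pages
PAGE_SIZE = 1 << PAGE_SHIFT


def pfn_phys(pfn):
    """Page frame number -> physical address of the start of that page."""
    return pfn << PAGE_SHIFT


def pfn_up(addr):
    """Address -> page frame number, rounding up to the next page boundary."""
    return (addr + PAGE_SIZE - 1) >> PAGE_SHIFT


def pfn_down(addr):
    """Address -> page frame number, rounding down."""
    return addr >> PAGE_SHIFT
```

For example, with 4 KiB pages `pfn_down(0x1fff)` is 1 while `pfn_up(0x1001)` is 2, which is the open-coded shifting the helpers replace.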
Arseny Krasnov
1af7e55511 vhost/vsock: support MSG_EOR bit processing
'MSG_EOR' handling uses the same logic as 'MSG_EOM': if the bit is
present in the packet's header, reset it to 0, then restore it if
packet processing wasn't completed. Instead of a bool variable for
each flag, a bit mask variable was added. It holds the logical OR of
'MSG_EOR' and 'MSG_EOM' as needed; to restore the flags, this variable
is ORed with the flags field of the packet.

Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
Link: https://lore.kernel.org/r/20210903123238.3273526-1-arseny.krasnov@kaspersky.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
2021-09-05 16:23:09 -04:00
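
The save-and-restore mask described above can be sketched like this. It is a Python model, not the vhost/vsock code: the bit values, `send_fragment()` and its parameters are illustrative assumptions.

```python
SEQ_EOM = 1 << 0  # illustrative bit values, not taken from the text above
SEQ_EOR = 1 << 1


def send_fragment(pkt_flags, pkt_len, room):
    """Send up to `room` bytes; return (flags_sent, flags_left, bytes_sent).

    When the packet does not fit, EOM/EOR are cleared on the fragment
    that goes out and restored on the part that remains to be sent.
    """
    flags_to_restore = 0
    flags = pkt_flags
    payload = min(pkt_len, room)
    if payload < pkt_len:
        # Partial send: remember which end markers were set, then clear them.
        for bit in (SEQ_EOM, SEQ_EOR):
            if flags & bit:
                flags &= ~bit
                flags_to_restore |= bit
    flags_sent = flags
    # One OR restores every saved bit for the unsent remainder.
    flags_left = flags | flags_to_restore
    return flags_sent, flags_left, payload


partial = send_fragment(SEQ_EOM | SEQ_EOR, pkt_len=100, room=40)
complete = send_fragment(SEQ_EOM, pkt_len=100, room=100)
```

A single mask scales to any number of flags, whereas one bool per flag needs a new variable and a new restore branch each time a flag is added.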
Arseny Krasnov
9af8f10616 virtio/vsock: rename 'EOR' to 'EOM' bit.
The currently implemented bit is used to mark the end of messages
('EOM' - end of message), not of records ('EOR' - end of record).
Also rename 'record' to 'message' in the implementation, as they are
different things.

Signed-off-by: Arseny Krasnov <arseny.krasnov@kaspersky.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20210903123109.3273053-1-arseny.krasnov@kaspersky.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-09-05 16:23:08 -04:00
Cai Huoqing
23b228cb89 vhost scsi: Convert to SPDX identifier
use SPDX-License-Identifier instead of a verbose license text

Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Link: https://lore.kernel.org/r/20210821123320.734-1-caihuoqing@baidu.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-09-05 16:23:08 -04:00
Yunsheng Lin
723783d077 sock: remove one redundant SKB_FRAG_PAGE_ORDER macro
SKB_FRAG_PAGE_ORDER is defined to the same value in both
net/core/sock.c and drivers/vhost/net.c.

Move the SKB_FRAG_PAGE_ORDER definition to net/core/sock.h,
as both net/core/sock.c and drivers/vhost/net.c include it,
and it seems a reasonable file to put the macro in.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-26 10:46:20 +01:00
Neeraj Upadhyay
e74cfa91f4 vringh: Use wiov->used to check for read/write desc order
As __vringh_iov() traverses a descriptor chain, it places
each descriptor entry into either the read or the write vring iov
and increments that iov's ->used member. So, as we iterate
over a descriptor chain, at any point the (riov/wiov)->used
value gives the number of descriptor entries available
to be read or written by the device. As all read
iovs must precede the write iovs, wiov->used should be zero
when we are traversing a read descriptor. The current code checks
wiov->i to figure out whether any previous entry in the
current descriptor chain was a write descriptor. However,
iov->i is only incremented when these vring iovs are consumed
at a later point, and it remains 0 in __vringh_iov(). So, correct
the check for read and write descriptor order to use
wiov->used.

Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org>
Link: https://lore.kernel.org/r/1624591502-4827-1-git-send-email-neeraju@codeaurora.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-08-11 06:44:24 -04:00
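
The difference between the two counters is easiest to see in a small model. This is a Python sketch, not the vringh code: `Iov`, `walk_chain()` and the boolean encoding of descriptors are assumptions made for illustration.

```python
class Iov:
    def __init__(self):
        self.used = 0  # entries filled in while walking the chain
        self.i = 0     # entries consumed later; stays 0 during the walk


def walk_chain(descs):
    """Sort descriptors into read/write iovs, rejecting a read after a write.

    `descs` is a list of booleans, where True means device-writable.
    Returns (riov, wiov) or raises ValueError on bad ordering.
    """
    riov, wiov = Iov(), Iov()
    for writable in descs:
        if writable:
            wiov.used += 1
        else:
            # Testing wiov.i here would never fire, since it is still 0.
            if wiov.used:
                raise ValueError("read descriptor after write descriptor")
            riov.used += 1
    return riov, wiov


riov, wiov = walk_chain([False, False, True])
```

After the walk `wiov.i` is still 0 even though a write descriptor was seen, which is why a check on `i` lets a misordered chain through.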
Xie Yongji
f7ad318ea0 vhost: Fix the calculation in vhost_overflow()
This fixes the incorrect integer overflow calculation
when the last address of the iova range is 0xffffffff.

Fixes: ec33d031a1 ("vhost: detect 32 bit integer wrap around")
Reported-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20210728130756.97-2-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-08-11 06:44:15 -04:00
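
The boundary case can be checked with plain arithmetic. The Python sketch below assumes a 32-bit unsigned long; `overflow_old()` is my reconstruction of the kind of off-by-one check the message calls incorrect, and `overflow_fixed()` is one way to test the last address instead. Neither is quoted from the patch.

```python
ULONG_MAX = 0xFFFFFFFF  # assumption: a 32-bit unsigned long


def overflow_old(uaddr, size):
    """Off-by-one check: rejects ranges that end exactly at ULONG_MAX."""
    return uaddr > ULONG_MAX or size > ULONG_MAX or uaddr > ULONG_MAX - size


def overflow_fixed(uaddr, size):
    """Check based on the last address of the range, uaddr + size - 1."""
    if uaddr > ULONG_MAX or size > ULONG_MAX:
        return True
    if size == 0:
        return False  # an empty range cannot wrap
    return uaddr > ULONG_MAX - size + 1


# A one-page range whose last address is exactly 0xffffffff.
last_page = (0xFFFFF000, 0x1000)
old_verdict = overflow_old(*last_page)
new_verdict = overflow_fixed(*last_page)
```

The range above ends at 0xffffffff without wrapping, so only the second function classifies it correctly; moving the start up by one byte makes both report an overflow.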