kernel

mirror of https://github.com/ukui/kernel.git synced 2026-03-09 10:07:04 -07:00

Author	SHA1	Message	Date
Romain Francoise	d04257b07f	vhost-net: don't open-code kvfree Commit `23cc5a991c` ("vhost-net: extend device allocation to vmalloc") added another open-coded version of kvfree (which is available since v3.15-rc5), nuke it. Signed-off-by: Romain Francoise <romain@orebokech.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2014-06-23 09:22:48 +03:00
Michael S. Tsirkin	47283bef7e	vhost: move memory pointer to VQs commit 2ae76693b8bcabf370b981cd00c36cd41d33fabc vhost: replace rcu with mutex replaced rcu sync for memory accesses with VQ mutex locl/unlock. This is correct since all accesses are under VQ mutex, but incomplete: we still do useless rcu lock/unlock operations, someone might copy this code into some other context where this won't be right. This use of RCU is also non standard and hard to understand. Let's copy the pointer to each VQ structure, this way the access rules become straight-forward, and there's no need for RCU anymore. Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2014-06-09 16:21:07 +03:00
Michael S. Tsirkin	ea16c51433	vhost: move acked_features to VQs Refactor code to make sure features are only accessed under VQ mutex. This makes everything simpler, no need for RCU here anymore. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2014-06-09 16:21:06 +03:00
Michael S. Tsirkin	23cc5a991c	vhost-net: extend device allocation to vmalloc Michael Mueller provided a patch to reduce the size of vhost-net structure as some allocations could fail under memory pressure/fragmentation. We are still left with high order allocations though. This patch is handling the problem at the core level, allowing vhost structures to use vmalloc() if kmalloc() failed. As vmalloc() adds overhead on a critical network path, add __GFP_REPEAT to kzalloc() flags to do this fallback only when really needed. People are still looking at cleaner ways to handle the problem at the API level, probably passing in multiple iovecs. This hack seems consistent with approaches taken since then by drivers/vhost/scsi.c and net/core/dev.c Based on patch by Romain Francoise. Cc: Michael Mueller <mimu@linux.vnet.ibm.com> Signed-off-by: Romain Francoise <romain@orebokech.com> Acked-by: Michael S. Tsirkin <mst@redhat.com>	2014-06-09 16:21:05 +03:00
Al Viro	09aaacf02a	vhost: don't open-code sockfd_put() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2014-04-01 23:19:10 -04:00
Michael S. Tsirkin	a39ee449f9	vhost: validate vhost_get_vq_desc return value vhost fails to validate negative error code from vhost_get_vq_desc causing a crash: we are using -EFAULT which is 0xfffffff2 as vector size, which exceeds the allocated size. The code in question was introduced in commit `8dd014adfe` vhost-net: mergeable buffers support CVE-2014-0055 Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-28 16:10:35 -04:00
Michael S. Tsirkin	d8316f3991	vhost: fix total length when packets are too short When mergeable buffers are disabled, and the incoming packet is too large for the rx buffer, get_rx_bufs returns success. This was intentional in order for make recvmsg truncate the packet and then handle_rx would detect err != sock_len and drop it. Unfortunately we pass the original sock_len to recvmsg - which means we use parts of iov not fully validated. Fix this up by detecting this overrun and doing packet drop immediately. CVE-2014-0077 Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-03-28 16:07:31 -04:00
Michael S. Tsirkin	b0c057ca7e	vhost: fix a theoretical race in device cleanup vhost_zerocopy_callback accesses VQ right after it drops a ubuf reference. In theory, this could race with device removal which waits on the ubuf kref, and crash on use after free. Do all accesses within rcu read side critical section, and synchronize on release. Since callbacks are always invoked from bh, synchronize_rcu_bh seems enough and will help release complete a bit faster. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-13 18:47:30 -05:00
Michael S. Tsirkin	0ad8b480d6	vhost: fix ref cnt checking deadlock vhost checked the counter within the refcnt before decrementing. It really wanted to know that it is the one that has the last reference, as a way to batch freeing resources a bit more efficiently. Note: we only let refcount go to 0 on device release. This works well but we now access the ref counter twice so there's a race: all users might see a high count and decide to defer freeing resources. In the end no one initiates freeing resources until the last reference is gone (which is on VM shotdown so might happen after a looooong time). Let's do what we probably should have done straight away: switch from kref to plain atomic, documenting the semantics, return the refcount value atomically after decrement, then use that to avoid the deadlock. Reported-by: Qin Chuanyu <qinchuanyu@huawei.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-02-13 18:47:30 -05:00
Zhi Yong Wu	59566b6e8c	vhost: remove the dead branch Since vhost_dev_init() forever return 0, some branches are never run, therefore need to be removed. Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-12-06 15:22:05 -05:00
Jason Wang	f7c6be404d	vhost_net: correctly limit the max pending buffers As Michael point out, We used to limit the max pending DMAs to get better cache utilization. But it was not done correctly since it was one done when there's no new buffers submitted from guest. Guest can easily exceeds the limitation by keeping sending packets. So this patch moves the check into main loop. Tests shows about 5%-10% improvement on per cpu throughput for guest tx. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-09-03 22:46:58 -04:00
Jason Wang	19c73b3e08	vhost_net: poll vhost queue after marking DMA is done We used to poll vhost queue before making DMA is done, this is racy if vhost thread were waked up before marking DMA is done which can result the signal to be missed. Fix this by always polling the vhost thread before DMA is done. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-09-03 22:46:57 -04:00
Jason Wang	ce21a02913	vhost_net: determine whether or not to use zerocopy at one time Currently, even if the packet length is smaller than VHOST_GOODCOPY_LEN, if upend_idx != done_idx we still set zcopy_used to true and rollback this choice later. This could be avoided by determining zerocopy once by checking all conditions at one time before. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-09-03 22:46:57 -04:00
Jason Wang	c92112aed3	vhost_net: use vhost_add_used_and_signal_n() in vhost_zerocopy_signal_used() We tend to batch the used adding and signaling in vhost_zerocopy_callback() which may result more than 100 used buffers to be updated in vhost_zerocopy_signal_used() in some cases. So switch to use vhost_add_used_and_signal_n() to avoid multiple calls to vhost_add_used_and_signal(). Which means much less times of used index updating and memory barriers. 2% performance improvement were seen on netperf TCP_RR test. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-09-03 22:46:57 -04:00
Jason Wang	094afe7d55	vhost_net: make vhost_zerocopy_signal_used() return void None of its caller use its return value, so let it return void. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-09-03 22:46:57 -04:00
Asias He	22fa90c7fb	vhost: Remove custom vhost rcu usage Now, vq->private_data is always accessed under vq mutex. No need to play the vhost rcu trick. Signed-off-by: Asias He <asias@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2013-07-11 15:38:40 +03:00
Asias He	2e26af79b7	vhost-net: Always access vq->private_data under vq mutex Signed-off-by: Asias He <asias@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2013-07-11 15:36:45 +03:00
Asias He	0a1febf7ba	vhost: Make local function static $ make C=1 M=drivers/vhost drivers/vhost/net.c:168:5: warning: symbol 'vhost_net_set_ubuf_info' was not declared. Should it be static? drivers/vhost/net.c:194:6: warning: symbol 'vhost_net_vq_reset' was not declared. Should it be static? drivers/vhost/scsi.c:219:6: warning: symbol 'tcm_vhost_done_inflight' was not declared. Should it be static? Signed-off-by: Asias He <asias@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2013-07-07 17:34:07 +03:00
Michael S. Tsirkin	c38e39c378	vhost-net: fix use-after-free in vhost_net_flush vhost_net_ubuf_put_and_wait has a confusing name: it will actually also free it's argument. Thus since commit `1280c27f8e` "vhost-net: flush outstanding DMAs on memory change" vhost_net_flush tries to use the argument after passing it to vhost_net_ubuf_put_and_wait, this results in use after free. To fix, don't free the argument in vhost_net_ubuf_put_and_wait, add an new API for callers that want to free ubufs. Acked-by: Asias He <asias@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2013-07-07 14:06:22 +03:00
Michael S. Tsirkin	288cfe78c8	vhost: fix ubuf_info cleanup vhost_net_clear_ubuf_info didn't clear ubuf_info after kfree, this could trigger double free. Fix this and simplify this code to make it more robust: make sure ubuf info is always freed through vhost_net_clear_ubuf_info. Reported-by: Tommi Rantala <tt.rantala@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-11 02:46:21 -07:00
Michael S. Tsirkin	05c0535194	vhost: check owner before we overwrite ubuf_info If device has an owner, we shouldn't touch ubuf_info since it might be in use. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-11 02:46:21 -07:00
Jason Wang	4364d5f96e	vhost_net: clear msg.control for non-zerocopy case during tx When we decide not use zero-copy, msg.control should be set to NULL otherwise macvtap/tap may set zerocopy callbacks which may decrease the kref of ubufs wrongly. Bug were introduced by commit `cedb9bdce0` (vhost-net: skip head management if no outstanding). This solves the following warnings: WARNING: at include/linux/kref.h:47 handle_tx+0x477/0x4b0 [vhost_net]() Modules linked in: vhost_net macvtap macvlan tun nfsd exportfs bridge stp llc openvswitch kvm_amd kvm bnx2 megaraid_sas [last unloaded: tun] CPU: 5 PID: 8670 Comm: vhost-8668 Not tainted 3.10.0-rc2+ #1566 Hardware name: Dell Inc. PowerEdge R715/00XHKG, BIOS 1.5.2 04/19/2011 ffffffffa0198323 ffff88007c9ebd08 ffffffff81796b73 ffff88007c9ebd48 ffffffff8103d66b 000000007b773e20 ffff8800779f0000 ffff8800779f43f0 ffff8800779f8418 000000000000015c 0000000000000062 ffff88007c9ebd58 Call Trace: [<ffffffff81796b73>] dump_stack+0x19/0x1e [<ffffffff8103d66b>] warn_slowpath_common+0x6b/0xa0 [<ffffffff8103d6b5>] warn_slowpath_null+0x15/0x20 [<ffffffffa0197627>] handle_tx+0x477/0x4b0 [vhost_net] [<ffffffffa0197690>] handle_tx_kick+0x10/0x20 [vhost_net] [<ffffffffa019541e>] vhost_worker+0xfe/0x1a0 [vhost_net] [<ffffffffa0195320>] ? vhost_attach_cgroups_work+0x30/0x30 [vhost_net] [<ffffffffa0195320>] ? vhost_attach_cgroups_work+0x30/0x30 [vhost_net] [<ffffffff81061f46>] kthread+0xc6/0xd0 [<ffffffff81061e80>] ? kthread_freezable_should_stop+0x70/0x70 [<ffffffff817a1aec>] ret_from_fork+0x7c/0xb0 [<ffffffff81061e80>] ? kthread_freezable_should_stop+0x70/0x70 Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-10 14:31:45 -07:00
Asias He	fe729a57c8	vhost-net: Cleanup vhost_ubuf and vhost_zcopy - Rename vhost_ubuf to vhost_net_ubuf - Rename vhost_zcopy_mask to vhost_net_zcopy_mask - Make funcs static Signed-off-by: Asias He <asias@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2013-05-06 13:25:47 +03:00
Asias He	8570a6e72c	vhost: Move VHOST_NET_FEATURES to net.c vhost.h should not depend on device specific marcos like VHOST_NET_F_VIRTIO_NET_HDR and VIRTIO_NET_F_MRG_RXBUF. Signed-off-by: Asias He <asias@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2013-05-06 13:21:00 +03:00
Asias He	b1ad8496c9	vhost-net: Free ubuf when vhost_dev_set_owner fails Signed-off-by: Asias He <asias@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2013-05-06 12:57:54 +03:00

1 2 3 4

100 Commits