61 Commits

Author SHA1 Message Date
Jason Wang
c70aa540c7 vhost: zerocopy: poll vq in zerocopy callback
We add used buffers and signal the guest in the worker thread, but did not poll
the virtqueue during the zero copy callback. As a result, adding used buffers
and signalling the guest may be missed during zerocopy. Solve this by polling
the virtqueue and letting it wake up the worker during the callback.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-05-02 18:22:25 +03:00
Michael S. Tsirkin
ca8f4fb21d skbuff: struct ubuf_info callback type safety
The skb struct ubuf_info callback gets passed struct ubuf_info
itself, not the arg value as the field name and the function signature
seem to imply. Rename the arg field to ctx to match usage,
add documentation and change the callback argument type
to make usage clear and to have compiler check correctness.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-13 13:09:19 -04:00
David S. Miller
f1e84eb3bb Merge branch 'vhost-net' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost 2012-03-23 14:46:48 -04:00
Cong Wang
c6daa7ffa8 vhost: remove the second argument of k[un]map_atomic()
Signed-off-by: Cong Wang <amwang@redhat.com>
2012-03-20 21:48:21 +08:00
Michael S. Tsirkin
ea5d404655 vhost: fix release path lockdep checks
We shouldn't hold any locks on the release path. Pass a flag to
vhost_dev_cleanup to use the lockdep info correctly.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Sasha Levin <levinsasha928@gmail.com>
2012-02-28 09:13:22 +02:00
Nadav Har'El
d550dda192 vhost: don't forget to schedule()
This is a tiny, but important, patch to vhost.

Vhost's worker thread only called schedule() when it had no work to do, and
it wanted to go to sleep. But if there's always work to do, e.g., the guest
is running a network-intensive program like netperf with small message sizes,
schedule() was *never* called. This had several negative implications (on
non-preemptive kernels):

 1. Passing time was not properly accounted to the "vhost" process (ps and
    top would wrongly show it using zero CPU time).

 2. Sometimes error messages about RCU timeouts would be printed, if the
    core running the vhost thread didn't schedule() for a very long time.

 3. Worst of all, a vhost thread would "hog" the core. If several vhost
    threads need to share the same core, typically one would get most of the
    CPU time (and its associated guest most of the performance), while the
    others hardly get any work done.

The trivial solution is to add

	if (need_resched())
		schedule();

after doing every piece of work. This will not do the heavy schedule() all
the time, just when the timer interrupt decided a reschedule is warranted
(so need_resched returns true).

Thanks to Abel Gordon for this patch.

Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2012-02-28 09:13:19 +02:00
Michael S. Tsirkin
b834226b04 vhost: optimize interrupt enable/disable
As we now only update the used ring after enabling
the backend, we can write flags with __put_user:
as that's done on the data path, it matters.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2011-07-19 17:17:28 +03:00
Michael S. Tsirkin
75fd9edc10 vhost: fix zcopy reference counting
Fix get/put refcount imbalance with zero copy,
which caused qemu to hang forever on guest driver unload.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2011-07-19 13:28:34 +03:00
Jason Wang
2723feaa8e vhost: set log when updating used flags or avail event
We need to log writes when updating used flags and avail event
fields.  Otherwise the guest may see a stale value after migration and
miss notifying the host.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2011-07-19 13:28:34 +03:00
Jason Wang
f59281dafb vhost: init used ring after backend was set
Move the used ring initialization to after the backend is set. This
makes it possible to disable the backend and tweak the used ring,
then restart. This will also make it possible to log the used ring
write correctly.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2011-07-19 13:28:34 +03:00
Michael S. Tsirkin
bab632d69e vhost: vhost TX zero-copy support
From: Shirley Ma <mashirle@us.ibm.com>

This adds experimental zero copy support in vhost-net,
disabled by default. To enable, set
experimental_zcopytx module option to 1.

This patch maintains the outstanding userspace buffers in the
sequence in which they are delivered to vhost. The outstanding userspace
buffers are marked as done once the lower device has finished DMA on them.
Completion is detected through the callback invoked when kfree_skb drops the
last reference. Two buffer indices are used for this purpose.
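
A minimal user-space sketch of how two indices can track outstanding buffers
in submission order while completions arrive in any order. All names here
(zc_ring, pending_end, done_start, RING_SIZE and the zc_* helpers) are
hypothetical and chosen for illustration; this is not the vhost-net code.

```c
#include <assert.h>
#include <stdbool.h>

#define RING_SIZE 8 /* hypothetical capacity */

struct zc_ring {
    bool done[RING_SIZE]; /* true once DMA for the slot has completed */
    unsigned pending_end; /* next slot to hand to the lower device */
    unsigned done_start;  /* oldest slot not yet reported to the guest */
};

/* Record a newly submitted buffer and return its slot number. */
static unsigned zc_submit(struct zc_ring *r)
{
    unsigned slot = r->pending_end;

    /* One slot stays free so that full and empty can be told apart. */
    assert((slot + 1) % RING_SIZE != r->done_start);
    r->done[slot] = false;
    r->pending_end = (slot + 1) % RING_SIZE;
    return slot;
}

/* Completion callback: may run in any order relative to submission. */
static void zc_complete(struct zc_ring *r, unsigned slot)
{
    r->done[slot] = true;
}

/* Report completed buffers strictly in submission order. */
static unsigned zc_reap(struct zc_ring *r)
{
    unsigned n = 0;

    while (r->done_start != r->pending_end && r->done[r->done_start]) {
        r->done_start = (r->done_start + 1) % RING_SIZE;
        n++;
    }
    return n;
}
```

In this sketch, a buffer that completes early is not reported until every
buffer submitted before it has also completed.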

The vhost-net device passes the userspace buffer info to the lower device
skb through message control. DMA done status checks and guest
notification are handled by handle_tx: in the worst case, all buffers
in the vq are in pending/done status, so we need to notify the guest to
release DMA done buffers first before we get any new buffers from the
vq.

One known problem is that if the guest stops submitting
buffers, buffers might never get used until some
further action, e.g. device reset. This does not
seem to affect linux guests.

Signed-off-by: Shirley <xma@us.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-18 10:42:32 -07:00
Michael S. Tsirkin
8ea8cf89e1 vhost: support event index
Support the new event index feature. When acked,
utilize it to reduce the # of interrupts sent to the guest.
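
A sketch of the event index check, modelled on the one described in the
virtio specification: the guest publishes the index at which it wants the
next interrupt, and the host signals only if that index was crossed by the
latest batch of used entries. The function and parameter names are
illustrative and are not taken from this patch.

```c
#include <stdint.h>

/*
 * Return nonzero if moving the used index from old_idx to new_idx
 * stepped past event_idx. All arithmetic is modulo 2^16, so the
 * check keeps working when the 16-bit indices wrap around.
 */
static int need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old_idx)
{
    return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old_idx);
}
```

For example, with old_idx = 4 and new_idx = 6, an event_idx of 5 triggers a
notification, while an event_idx of 10 does not.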

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-05-30 11:14:15 +09:30
Rob Landley
6151658751 Correct occurrences of
- Documentation/kvm/ to Documentation/virtual/kvm
- Documentation/uml/ to Documentation/virtual/uml
- Documentation/lguest/ to Documentation/virtual/lguest
throughout the kernel source tree.

Signed-off-by: Rob Landley <rob@landley.net>
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
2011-05-06 09:27:55 -07:00
Michael S. Tsirkin
fcc042a280 vhost: copy_from_user -> __copy_from_user
copy_from_user is pretty high on perf top profile,
replacing it with __copy_from_user helps.
It's also safe because we do access_ok checks during setup.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2011-03-08 18:03:05 +02:00
Krishna Kumar
d47effe1be vhost: Cleanup vhost.c and net.c
Minor cleanup of vhost.c and net.c to match coding style.

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2011-03-08 18:02:47 +02:00
Michael S. Tsirkin
0174b0c30a vhost: fix signed/unsigned comparison
To detect that a sequence number is done, we do math on unsigned
integers, so the result is unsigned too. That is not what was intended for
the <= comparison. As a result, the user gets stuck forever in the flush call.
Convert to int to fix this.
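
The pitfall can be reproduced in a few lines of user-space C. This is a
sketch with hypothetical function names, assuming a 32-bit unsigned int and
the usual two's complement conversion to int; it is not the vhost code.

```c
/*
 * Unsigned version: the difference can never be negative, so the test
 * is only true when the two sequence numbers are exactly equal.
 */
static int seq_done_unsigned(unsigned seq, unsigned done_seq)
{
    unsigned left = seq - done_seq;

    return left <= 0;
}

/*
 * Signed version: converting the difference to int makes "done_seq has
 * reached or passed seq" show up as a value <= 0, including across
 * counter wraparound.
 */
static int seq_done_signed(unsigned seq, unsigned done_seq)
{
    int left = (int)(seq - done_seq);

    return left <= 0;
}
```

With seq = 5 and done_seq = 7, the unsigned version reports "not done" while
the signed version reports "done"; a waiter that misses the moment of exact
equality would never be released by the unsigned check.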

Further, get rid of ({}) to make code clearer.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2011-01-10 10:03:39 +02:00
Michael S. Tsirkin
28831ee60b vhost: better variable name in logging
We really store a page offset in write_address,
so rename it write_page to avoid confusion.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2010-12-09 16:00:10 +02:00
Michael S. Tsirkin
3bf9be40ff vhost: correctly set bits of dirty pages
Fix two bugs in dirty page logging:
When counting pages we should increase address by 1 instead of
VHOST_PAGE_SIZE. Make log_write() correctly process requests
that cross pages with write_address not starting at page boundary.
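
A user-space sketch of the page-marking loop this describes, assuming a
4096-byte page and a plain in-memory bitmap with one bit per page. The name
log_write_sketch, the SKETCH_PAGE_SIZE constant and the bitmap layout are
assumptions made for illustration; the kernel version writes to a user-space
log instead.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size */

/*
 * Set one bit per page touched by [write_address, write_address + length).
 * The caller must provide a bitmap large enough for the highest page.
 */
static void log_write_sketch(uint8_t *bitmap, uint64_t write_address,
                             uint64_t write_length)
{
    uint64_t write_page = write_address / SKETCH_PAGE_SIZE;

    if (write_length == 0)
        return;
    /* Measure from the start of the first page, so a write that begins
     * mid-page and spills into the next page is still covered. */
    write_length += write_address % SKETCH_PAGE_SIZE;
    for (;;) {
        bitmap[write_page / 8] |= (uint8_t)(1u << (write_page % 8));
        if (write_length <= SKETCH_PAGE_SIZE)
            break;
        write_length -= SKETCH_PAGE_SIZE;
        write_page += 1; /* advance by one page number, not by the page size */
    }
}
```

A 200-byte write at address 4000 crosses from page 0 into page 1, so both
bits end up set.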

Reported-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2010-12-09 15:39:17 +02:00
Michael S. Tsirkin
bf5e0bd27f vhost: remove unused include
vhost.c does not need to know about sockets,
don't include sock.h

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2010-12-09 15:39:13 +02:00
Michael S. Tsirkin
8b7347aab6 vhost: get/put_user -> __get/__put_user
We do access_ok checks on all ring values on an ioctl,
so we don't need to redo them on each access.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2010-11-04 13:22:12 +02:00
Michael S. Tsirkin
dfe5ac5b18 vhost: copy_to_user -> __copy_to_user
We do access_ok checks at setup time, so we don't need to
redo them on each access.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2010-11-04 13:22:11 +02:00
Michael S. Tsirkin
64e1c80748 vhost-net: batch use/unuse mm
Move use/unuse mm to vhost.c which makes it possible to batch these
operations.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2010-11-04 13:22:11 +02:00
Michael S. Tsirkin
533a19b4b8 vhost: put mm after thread stop
This makes it possible to batch use/unuse mm.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2010-11-04 13:22:10 +02:00
Julia Lawall
3fcedec752 drivers/vhost/vhost.c: delete double assignment
Delete successive assignments to the same location.

A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@@
expression i;
@@

*i = ...;
 i = ...;
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2010-10-26 20:39:30 +02:00
Linus Torvalds
5f05647dd8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1699 commits)
  bnx2/bnx2x: Unsupported Ethtool operations should return -EINVAL.
  vlan: Calling vlan_hwaccel_do_receive() is always valid.
  tproxy: use the interface primary IP address as a default value for --on-ip
  tproxy: added IPv6 support to the socket match
  cxgb3: function namespace cleanup
  tproxy: added IPv6 support to the TPROXY target
  tproxy: added IPv6 socket lookup function to nf_tproxy_core
  be2net: Changes to use only priority codes allowed by f/w
  tproxy: allow non-local binds of IPv6 sockets if IP_TRANSPARENT is enabled
  tproxy: added tproxy sockopt interface in the IPV6 layer
  tproxy: added udp6_lib_lookup function
  tproxy: added const specifiers to udp lookup functions
  tproxy: split off ipv6 defragmentation to a separate module
  l2tp: small cleanup
  nf_nat: restrict ICMP translation for embedded header
  can: mcp251x: fix generation of error frames
  can: mcp251x: fix endless loop in interrupt handler if CANINTF_MERRF is set
  can-raw: add msg_flags to distinguish local traffic
  9p: client code cleanup
  rds: make local functions/variables static
  ...

Fix up conflicts in net/core/dev.c, drivers/net/pcmcia/smc91c92_cs.c and
drivers/net/wireless/ath/ath9k/debug.c as per David
2010-10-23 11:47:02 -07:00