Commit Graph

45 Commits

Author SHA1 Message Date
Michael S. Tsirkin
615cc2211c vhost: error handling fix
vhost should set worker to NULL on cgroups attach failure,
so that we won't try to destroy the worker again on close.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2010-09-06 09:49:39 +03:00
Michael S. Tsirkin
87d6a412bd vhost: fix attach to cgroups regression
Since 2.6.36-rc1, non-root users of vhost-net fail to attach
if they are in any cgroups.

The reason is that when qemu uses vhost, vhost wants to attach
its thread to all cgroups that qemu has.  But we got the API backwards,
so a non-priveledged process (Qemu) tried to control
the priveledged one (vhost), which fails.

Fix this by switching to the new cgroup_attach_task_all,
and running it from the vhost thread.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2010-09-06 09:49:31 +03:00
Eric Dumazet
78b620ce9e vhost: stop worker only if created
Its currently illegal to call kthread_stop(NULL)

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-01 14:26:13 -07:00
Linus Torvalds
6ba74014c1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1443 commits)
  phy/marvell: add 88ec048 support
  igb: Program MDICNFG register prior to PHY init
  e1000e: correct MAC-PHY interconnect register offset for 82579
  hso: Add new product ID
  can: Add driver for esd CAN-USB/2 device
  l2tp: fix export of header file for userspace
  can-raw: Fix skb_orphan_try handling
  Revert "net: remove zap_completion_queue"
  net: cleanup inclusion
  phy/marvell: add 88e1121 interface mode support
  u32: negative offset fix
  net: Fix a typo from "dev" to "ndev"
  igb: Use irq_synchronize per vector when using MSI-X
  ixgbevf: fix null pointer dereference due to filter being set for VLAN 0
  e1000e: Fix irq_synchronize in MSI-X case
  e1000e: register pm_qos request on hardware activation
  ip_fragment: fix subtracting PPPOE_SES_HLEN from mtu twice
  net: Add getsockopt support for TCP thin-streams
  cxgb4: update driver version
  cxgb4: add new PCI IDs
  ...

Manually fix up conflicts in:
 - drivers/net/e1000e/netdev.c: due to pm_qos registration
   infrastructure changes
 - drivers/net/phy/marvell.c: conflict between adding 88ec048 support
   and cleaning up the IDs
 - drivers/net/wireless/ipw2x00/ipw2100.c: trivial ipw2100_pm_qos_req
   conflict (registration change vs marking it static)
2010-08-04 11:47:58 -07:00
David Stevens
8dd014adfe vhost-net: mergeable buffers support
This adds support for mergeable buffers in vhost-net: this is needed
for older guests without indirect buffer support, as well
as for zero copy with some devices.

Includes changes by Michael S. Tsirkin to make the
patch as low risk as possible (i.e., close to no changes
when feature is disabled).

Signed-off-by: David Stevens <dlstevens@us.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2010-07-28 15:45:36 +03:00
Michael S. Tsirkin
9e3d195720 vhost: apply cgroup to vhost workers
Apply the cgroup of the owner task to the created vhost worker.

Based on patches from Sridhar Samudrala's and Tejun Heo.
Later we'll need to also apply cpumask and probably priority
of the owner process.

Discussion on the best way to do this is still ongoing.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Sridhar Samudrala <samudrala.sridhar@gmail.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
2010-07-28 15:45:18 +03:00
Tejun Heo
c23f3445e6 vhost: replace vhost_workqueue with per-vhost kthread
Replace vhost_workqueue with per-vhost kthread.  Other than callback
argument change from struct work_struct * to struct vhost_work *,
there's no visible change to vhost_poll_*() interface.

This conversion is to make each vhost use a dedicated kthread so that
resource control via cgroup can be applied.

Partially based on Sridhar Samudrala's patch.

* Updated to use sub structure vhost_work instead of directly using
  vhost_poll at Michael's suggestion.

* Added flusher wake_up() optimization at Michael's suggestion.

Changes by MST:
* Converted atomics/barrier use to a spinlock.
* Create thread on SET_OWNER
* Fix flushing

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Sridhar Samudrala <samudrala.sridhar@gmail.com>
2010-07-28 15:44:53 +03:00
David S. Miller
4c3e5edf2f vhost net: Fix warning.
Reported by Stephen Rothwell.

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-21 21:09:23 -07:00
David S. Miller
11fe883936 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/vhost/net.c
	net/bridge/br_device.c

Fix merge conflict in drivers/vhost/net.c with guidance from
Stephen Rothwell.

Revert the effects of net-2.6 commit 573201f36f
since net-next-2.6 has fixes that make bridge netpoll work properly thus
we don't need it disabled.

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-07-20 18:25:24 -07:00
Linus Torvalds
516bd66415 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (24 commits)
  bridge: Partially disable netpoll support
  tcp: fix crash in tcp_xmit_retransmit_queue
  IPv6: fix CoA check in RH2 input handler (mip6_rthdr_input())
  ibmveth: lost IRQ while closing/opening device leads to service loss
  rt2x00: Fix lockdep warning in rt2x00lib_probe_dev()
  vhost: avoid pr_err on condition guest can trigger
  ipmr: Don't leak memory if fib lookup fails.
  vhost-net: avoid flush under lock
  net: fix problem in reading sock TX queue
  net/core: neighbour update Oops
  net: skb_tx_hash() fix relative to skb_orphan_try()
  rfs: call sock_rps_record_flow() in tcp_splice_read()
  xfrm: do not assume that template resolving always returns xfrms
  hostap_pci: set dev->base_addr during probe
  axnet_cs: use spin_lock_irqsave in ax_interrupt
  dsa: Fix Kconfig dependencies.
  act_nat: not all of the ICMP packets need an IP header payload
  r8169: incorrect identifier for a 8168dp
  Phonet: fix skb leak in pipe endpoint accept()
  Bluetooth: Update sec_level/auth_type for already existing connections
  ...
2010-07-20 16:26:42 -07:00
Michael S. Tsirkin
95c0ec6a97 vhost: avoid pr_err on condition guest can trigger
Guest can trigger packet truncation by posting
a very short buffer and disabling buffer merging.
Convert pr_err to pr_debug to avoid log from filling
up when this happens.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2010-07-16 12:18:17 +03:00
Michael S. Tsirkin
1680e9063e vhost-net: avoid flush under lock
We flush under vq mutex when changing backends.
This creates a deadlock as workqueue being flushed
needs this lock as well.

https://bugzilla.redhat.com/show_bug.cgi?id=612421

Drop the vq mutex before flush: we have the device mutex
which is sufficient to prevent another ioctl from touching
the vq.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2010-07-15 15:26:12 +03:00
Linus Torvalds
2aa72f6121 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (35 commits)
  NET: SB1250: Initialize .owner
  vxge: show startup message with KERN_INFO
  ll_temac: Fix missing iounmaps
  bridge: Clear IPCB before possible entry into IP stack
  bridge br_multicast: BUG: unable to handle kernel NULL pointer dereference
  net: Fix definition of netif_vdbg() when VERBOSE_DEBUG is defined
  net/ne: fix memory leak in ne_drv_probe()
  xfrm: fix xfrm by MARK logic
  virtio_net: fix oom handling on tx
  virtio_net: do not reschedule rx refill forever
  s2io: resolve statistics issues
  linux/net.h: fix kernel-doc warnings
  net: decreasing real_num_tx_queues needs to flush qdisc
  sched: qdisc_reset_all_tx is calling qdisc_reset without qdisc_lock
  qlge: fix a eeh handler to not add a pending timer
  qlge: Replacing add_timer() to mod_timer()
  usbnet: Set parent device early for netdev_printk()
  net: Revert "rndis_host: Poll status channel before control channel"
  netfilter: ip6t_REJECT: fix a dst leak in ipv6 REJECT
  drivers: bluetooth: bluecard_cs.c: Fixed include error, changed to linux/io.h
  ...
2010-07-07 19:56:00 -07:00
David S. Miller
597e608a84 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-07-07 15:59:38 -07:00
Michael S. Tsirkin
7b3384fc30 vhost: add unlikely annotations to error path
patch 'break out of polling loop on error' caused
a minor performance regression on my machine: recover
that performance by adding a bunch of unlikely annotations
in the error handling.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2010-07-01 19:25:40 +03:00
Michael S. Tsirkin
d5675bd204 vhost: break out of polling loop on error
When ring parsing fails, we currently handle this
as ring empty condition. This means that we enable
kicks and recheck ring empty: if this not empty,
we re-start polling which of course will fail again.

Instead, let's return a negative error code and stop polling.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2010-06-27 11:52:25 +03:00
Alan Cox
79907d89c3 misc: Fix allocation 'borrowed' by vhost_net
10, 233 is allocated officially to /dev/kmview which is shipping in
Ubuntu and Debian distributions.  vhost_net seem to have borrowed it
without making a proper request and this causes regressions in the other
distributions.

vhost_net can use a dynamic minor so use that instead.  Also update the
file with a comment to try and avoid future misunderstandings.

cc: stable@kernel.org
Signed-off-by: Alan Cox <device@lanana.org>
[ We should have caught this before 2.6.34 got released.  - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-06-09 08:50:31 -07:00
David S. Miller
0dea7c12fc Merge branch 'vhost-net-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost 2010-06-02 08:26:36 -07:00
Linus Torvalds
72da3bc0cb Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (22 commits)
  netlink: bug fix: wrong size was calculated for vfinfo list blob
  netlink: bug fix: don't overrun skbs on vf_port dump
  xt_tee: use skb_dst_drop()
  netdev/fec: fix ifconfig eth0 down hang issue
  cnic: Fix context memory init. on 5709.
  drivers/net: Eliminate a NULL pointer dereference
  drivers/net/hamradio: Eliminate a NULL pointer dereference
  be2net: Patch removes redundant while statement in loop.
  ipv6: Add GSO support on forwarding path
  net: fix __neigh_event_send()
  vhost: fix the memory leak which will happen when memory_access_ok fails
  vhost-net: fix to check the return value of copy_to/from_user() correctly
  vhost: fix to check the return value of copy_to/from_user() correctly
  vhost: Fix host panic if ioctl called with wrong index
  net: fix lock_sock_bh/unlock_sock_bh
  net/iucv: Add missing spin_unlock
  net: ll_temac: fix checksum offload logic
  net: ll_temac: fix interrupt bug when interrupt 0 is used
  sctp: dubious bitfields in sctp_transport
  ipmr: off by one in __ipmr_fill_mroute()
  ...
2010-05-28 10:18:40 -07:00
Takuya Yoshikawa
a02c37891a vhost: fix the memory leak which will happen when memory_access_ok fails
We need to free newmem when vhost_set_memory() fails to complete.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2010-05-27 13:55:17 +03:00
Takuya Yoshikawa
d3553a5249 vhost-net: fix to check the return value of copy_to/from_user() correctly
copy_to/from_user() returns the number of bytes that could not be copied.

So we need to check if it is not zero, and in that case, we should return
the error number -EFAULT rather than directly return the return value from
copy_to/from_user().

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2010-05-27 13:55:13 +03:00
Takuya Yoshikawa
7ad9c9d270 vhost: fix to check the return value of copy_to/from_user() correctly
copy_to/from_user() returns the number of bytes that could not be copied.

So we need to check if it is not zero, and in that case, we should return
the error number -EFAULT rather than directly return the return value from
copy_to/from_user().

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2010-05-27 13:54:59 +03:00
Michael S. Tsirkin
f8322fbe04 vhost: whitespace fix
Fix up whitespace in vq_memory_access_ok.

Reported-by: Aristeu Rozanski <aris@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2010-05-27 12:28:03 +03:00
Jeff Dike
dd1f4078f0 vhost-net: minor cleanup
Delete a label and goto from vhost_net_set_backend
Inverting a test allows a label and goto to be eliminated.

Signed-off-by: Jeff Dike <jdike@linux.intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2010-05-27 12:25:37 +03:00
Tobias Klauser
373a83a699 vhost: Storage class should be before const qualifier
The C99 specification states in section 6.11.5:

The placement of a storage-class specifier other than at the beginning
of the declaration specifiers in a declaration is an obsolescent
feature.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2010-05-27 12:23:49 +03:00