Commit Graph

95 Commits

Author SHA1 Message Date
Stephen Hemminger
be6528b2e5 virtio: fix format of sysfs driver/vendor files
The sysfs files for virtio produce the wrong format and are missing
the required newline. The output for virtio bus vendor/device should
have the same format as the corresponding entries for PCI devices.

Although this technically changes the ABI for sysfs, these files were
broken to start with!

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-11-24 15:21:12 +10:30
Michael S. Tsirkin
7ae4b866f8 virtio: return correct capacity to users
We can't rely on indirect buffers for capacity
calculations because they need a memory allocation
which might fail.  In particular, virtio_net can get
into this situation under stress, and it drops packets
and performs badly.

So return the number of buffers we can guarantee users.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Reported-By: Krishna Kumar2 <krkumar2@in.ibm.com>
2010-11-24 15:21:11 +10:30
Michael S. Tsirkin
1fe9b6fef1 virtio: fix oops on OOM
virtio ring was changed to return an error code on OOM,
but one caller was missed and still checks for vq->vring.num.
The fix is just to check for <0 error code.

Long term it might make sense to change goto add_head to
just return an error on oom instead, but let's apply
a minimal fix for 2.6.35.

Reported-by: Chris Mason <chris.mason@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Tested-by: Chris Mason <chris.mason@oracle.com>
Cc: stable@kernel.org # .34.x
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-07-26 08:05:31 -07:00
Michael S. Tsirkin
b03214d559 virtio-pci: disable msi at startup
virtio-pci resets the device at startup by writing to the status
register, but this does not clear the pci config space,
specifically msi enable status which affects register
layout.

This breaks things like kdump when they try to use e.g. virtio-blk.

Fix by forcing msi off at startup. Since pci.c already has
a routine to do this, we export and use it instead of duplicating code.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: linux-pci@vger.kernel.org
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org
2010-06-23 22:49:07 +09:30
Michael S. Tsirkin
686d363786 virtio: return ENOMEM on out of memory
add_buf returns ring size on out of memory,
this is not what devices expect.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org # .34.x
2010-06-23 22:49:06 +09:30
Linus Torvalds
1756ac3d3c Merge branch 'virtio' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
* 'virtio' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: (27 commits)
  drivers/char: Eliminate use after free
  virtio: console: Accept console size along with resize control message
  virtio: console: Store each console's size in the console structure
  virtio: console: Resize console port 0 on config intr only if multiport is off
  virtio: console: Add support for nonblocking write()s
  virtio: console: Rename wait_is_over() to will_read_block()
  virtio: console: Don't always create a port 0 if using multiport
  virtio: console: Use a control message to add ports
  virtio: console: Move code around for future patches
  virtio: console: Remove config work handler
  virtio: console: Don't call hvc_remove() on unplugging console ports
  virtio: console: Return -EPIPE to hvc_console if we lost the connection
  virtio: console: Let host know of port or device add failures
  virtio: console: Add a __send_control_msg() that can send messages without a valid port
  virtio: Revert "virtio: disable multiport console support."
  virtio: add_buf_gfp
  trans_virtio: use virtqueue_xxx wrappers
  virtio-rng: use virtqueue_xxx wrappers
  virtio_ring: remove a level of indirection
  virtio_net: use virtqueue_xxx wrappers
  ...

Fix up conflicts in drivers/net/virtio_net.c due to new virtqueue_xxx
wrappers changes conflicting with some other cleanups.
2010-05-21 17:22:52 -07:00
Michael S. Tsirkin
bbd603efb4 virtio: add_buf_gfp
Add an add_buf variant that gets gfp parameter. Use that
to allocate indirect buffers.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-05-19 22:15:46 +09:30
Michael S. Tsirkin
7c5e9ed0c8 virtio_ring: remove a level of indirection
We have a single virtqueue_ops implementation,
and it seems unlikely we'll get another one
at this point. So let's remove an unnecessary
level of indirection: it would be very easy to
re-add it if another implementation surfaces.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-05-19 22:15:43 +09:30
Michael S. Tsirkin
946cfe0e05 virtio_balloon: use virtqueue_xxx wrappers
Switch virtio_balloon to new virtqueue_xxx wrappers.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-05-19 22:15:41 +09:30
Jiri Kosina
6c9468e9eb Merge branch 'master' into for-next 2010-04-23 02:08:44 +02:00
Balbir Singh
61fb06cc8e virtio: Fix GFP flags passed from the virtio balloon driver
The virtio balloon driver can dig into the reservation pools of the OS
to satisfy a balloon request.  This is not advisable and other balloon
drivers (drivers/xen/balloon.c) avoid this as well.

The patch also adds changes to avoid printing a warning if allocation
fails, since we retry after sometime anyway.

Signed-off-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: kvm <kvm@vger.kernel.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-22 07:34:05 -07:00
Tejun Heo
5a0e3ad6af include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-30 22:02:32 +09:00
Thomas Weber
8839316121 Fix typos in comments
[Ss]ytem => [Ss]ystem
udpate => update
paramters => parameters
orginal => original

Signed-off-by: Thomas Weber <swirl@gmx.li>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-03-16 11:47:56 +01:00
Michael S. Tsirkin
bc505f3739 virtio: set pci bus master enable bit
As all virtio devices perform DMA, we
must enable bus mastering for them to be
spec compliant.

This patch fixes hotplug of virtio devices
with Linux guests and qemu 0.11-0.12.

Tested-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2010-03-02 13:41:14 +02:00
Michael S. Tsirkin
3119815912 virtio: fix out of range array access
I have observed the following error on virtio-net module unload:

------------[ cut here ]------------
WARNING: at kernel/irq/manage.c:858 __free_irq+0xa0/0x14c()
Hardware name: Bochs
Trying to free already-free IRQ 0
Modules linked in: virtio_net(-) virtio_blk virtio_pci virtio_ring
virtio af_packet e1000 shpchp aacraid uhci_hcd ohci_hcd ehci_hcd [last
unloaded: scsi_wait_scan]
Pid: 1957, comm: rmmod Not tainted 2.6.33-rc8-vhost #24
Call Trace:
 [<ffffffff8103e195>] warn_slowpath_common+0x7c/0x94
 [<ffffffff8103e204>] warn_slowpath_fmt+0x41/0x43
 [<ffffffff810a7a36>] ? __free_pages+0x5a/0x70
 [<ffffffff8107cc00>] __free_irq+0xa0/0x14c
 [<ffffffff8107cceb>] free_irq+0x3f/0x65
 [<ffffffffa0081424>] vp_del_vqs+0x81/0xb1 [virtio_pci]
 [<ffffffffa0091d29>] virtnet_remove+0xda/0x10b [virtio_net]
 [<ffffffffa0075200>] virtio_dev_remove+0x22/0x4a [virtio]
 [<ffffffff812709ee>] __device_release_driver+0x66/0xac
 [<ffffffff81270ab7>] driver_detach+0x83/0xa9
 [<ffffffff8126fc66>] bus_remove_driver+0x91/0xb4
 [<ffffffff81270fcf>] driver_unregister+0x6c/0x74
 [<ffffffffa0075418>] unregister_virtio_driver+0xe/0x10 [virtio]
 [<ffffffffa0091c4d>] fini+0x15/0x17 [virtio_net]
 [<ffffffff8106997b>] sys_delete_module+0x1c3/0x230
 [<ffffffff81007465>] ? old_ich_force_enable_hpet+0x117/0x164
 [<ffffffff813bb720>] ? do_page_fault+0x29c/0x2cc
 [<ffffffff81028e58>] sysenter_dispatch+0x7/0x27
---[ end trace 15e88e4c576cc62b ]---

The bug is in virtio-pci: we use msix_vector as array index to get irq
entry, but some vqs do not have a dedicated vector so this causes an out
of bounds access.  By chance, we seem to often get 0 value, which
results in this error.

Fix by verifying that vector is legal before using it as index.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Anthony Liguori <aliguori@us.ibm.com>
Acked-by: Shirley Ma <xma@us.ibm.com>
Acked-by: Amit Shah <amit.shah@redhat.com>
2010-02-28 20:39:11 +02:00
Amit Shah
3b8706240e virtio: Initialize vq->data entries to NULL
vq operations depend on vq->data[i] being NULL to figure out if the vq
entry is in use (since the previous patch).

We have to initialize them to NULL to ensure we don't work with junk
data and trigger false BUG_ONs.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Shirley Ma <xma@us.ibm.com>
2010-02-24 14:22:29 +10:30
Shirley Ma
c021eac414 virtio: Add ability to detach unused buffers from vrings
There's currently no way for a virtio driver to ask for unused
buffers, so it has to keep a list itself to reclaim them at shutdown.
This is redundant, since virtio_ring stores that information.  So
add a new hook to do this.

Signed-off-by: Shirley Ma <xma@us.ibm.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-24 14:22:27 +10:30
Michael S. Tsirkin
d57ed95da4 virtio: use smp_XX barriers on SMP
virtio is communicating with a virtual "device" that actually runs on
another host processor. Thus SMP barriers can be used to control
memory access ordering.

Where possible, we should use SMP barriers which are more lightweight than
mandatory barriers, because mandatory barriers also control MMIO effects on
accesses through relaxed memory I/O windows (which virtio does not use)
(compare specifically smp_rmb and rmb on x86_64).

We can't just use smp_mb and friends though, because
we must force memory ordering even if guest is UP since host could be
running on another CPU, but SMP barriers are defined to barrier() in
that configuration. So, for UP fall back to mandatory barriers instead.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-24 14:22:25 +10:30
Rusty Russell
97a545ab6c virtio: remove bogus barriers from DEBUG version of virtio_ring.c
With DEBUG defined, we add an ->in_use flag to detect if the caller
invokes two virtio methods in parallel.  The barriers attempt to ensure
timely update of the ->in_use flag.

But they're voodoo: if we need these barriers it implies that the
calling code doesn't have sufficient synchronization to ensure the
code paths aren't invoked at the same time anyway, and we want to
detect it.

Also, adding barriers changes timing, so turning on debug has more
chance of hiding real problems.

Thanks to MST for drawing my attention to this code...

CC: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-24 14:22:24 +10:30
Rusty Russell
169c246a30 virtio: fix balloon without VIRTIO_BALLOON_F_STATS_VQ
When running under qemu-kvm-0.11.0:

	BUG: unable to handle kernel paging request at 56e58955
	...
	Process vballoon (pid: 1297, ti=c7976000 task=c70a6ca0 task.ti=c7
	...
	Call Trace:
	 [<c88253a3>] ? balloon+0x1b3/0x440 [virtio_balloon]
	 [<c041c2d7>] ? schedule+0x327/0x9d0
	 [<c88251f0>] ? balloon+0x0/0x440 [virtio_balloon]
	 [<c014a2d4>] ? kthread+0x74/0x80
	 [<c014a260>] ? kthread+0x0/0x80
	 [<c0103b36>] ? kernel_thread_helper+0x6/0x30

need_stats_update should be zero-initialized.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Adam Litke <agl@us.ibm.com>
2010-02-24 14:22:17 +10:30
Adam Litke
1f34c71afe virtio: Fix scheduling while atomic in virtio_balloon stats
This is a fix for my earlier patch: "virtio: Add memory statistics reporting to
the balloon driver (V4)".

I discovered that all_vm_events() can sleep and therefore stats collection
cannot be done in interrupt context.  One solution is to handle the interrupt
by noting that stats need to be collected and waking the existing vballoon
kthread which will complete the work via stats_handle_request().  Rusty, is
this a saner way of doing business?

There is one issue that I would like a broader opinion on.  In stats_request, I
update vb->need_stats_update and then wake up the kthread.  The kthread uses
vb->need_stats_update as a condition variable.  Do I need a memory barrier
between the update and wake_up to ensure that my kthread sees the correct
value?  My testing suggests that it is not needed but I would like some
confirmation from the experts.

Signed-off-by: Adam Litke <agl@us.ibm.com>
To: Rusty Russell <rusty@rustcorp.com.au>
Cc: Anthony Liguori <aliguori@linux.vnet.ibm.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-24 14:22:14 +10:30
Adam Litke
9564e138b1 virtio: Add memory statistics reporting to the balloon driver (V4)
Changes since V3:
 - Do not do endian conversions as they will be done in the host
 - Report stats that reference a quantity of memory in bytes
 - Minor coding style updates

Changes since V2:
 - Increase stat field size to 64 bits
 - Report all sizes in kb (not pages)
 - Drop anon_pages stat and fix endianness conversion

Changes since V1:
 - Use a virtqueue instead of the device config space

When using ballooning to manage overcommitted memory on a host, a system for
guests to communicate their memory usage to the host can provide information
that will minimize the impact of ballooning on the guests.  The current method
employs a daemon running in each guest that communicates memory statistics to a
host daemon at a specified time interval.  The host daemon aggregates this
information and inflates and/or deflates balloons according to the level of
host memory pressure.  This approach is effective but overly complex since a
daemon must be installed inside each guest and coordinated to communicate with
the host.  A simpler approach is to collect memory statistics in the virtio
balloon driver and communicate them directly to the hypervisor.

This patch enables the guest-side support by adding stats collection and
reporting to the virtio balloon driver.

Signed-off-by: Adam Litke <agl@us.ibm.com>
Cc: Anthony Liguori <anthony@codemonkey.ws>
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (minor fixes)
2010-02-24 14:22:08 +10:30
Jamie Lokier
1f08b833dd Add __devexit_p around reference to virtio_pci_remove
This is needed to compile with CONFIG_VIRTIO_PCI=y,
because virtio_pci_remove is marked __devexit.

Signed-off-by: Jamie Lokier <jamie@shareable.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2010-02-24 14:22:04 +10:30
Jeff Mahoney
d817cd5255 virtio: fix section mismatch warnings
Fix fixes the following warnings by renaming the driver structures to be
suffixed with _driver.

WARNING: drivers/virtio/virtio_balloon.o(.data+0x88): Section mismatch in reference from the variable virtio_balloon to the function .devexit.text:virtballoon_remove()

WARNING: drivers/char/hw_random/virtio-rng.o(.data+0x88): Section mismatch in reference from the variable virtio_rng to the function .devexit.text:virtrng_remove()

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-01-16 12:15:39 -08:00
Michael S. Tsirkin
2d61ba9503 virtio: order used ring after used index read
On SMP guests, reads from the ring might bypass used index reads. This
causes guest crashes because host writes to used index to signal ring
data readiness.  Fix this by inserting rmb before used ring reads.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org
2009-10-29 08:50:37 +10:30