Commit Graph

148818 Commits

Author SHA1 Message Date
Rusty Russell 713b15b378 lguest: be paranoid about guest playing with device descriptors.
We can't trust the values in the device descriptor table once the
guest has booted, so keep local copies.  They could set them to
strange values then cause us to segv (they're 8 bit values, so they
can't make our pointers go too wild).

This becomes more important with the following patches which read them.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-06-12 22:26:59 +09:30
Christian Borntraeger e335385373 virtio: enhance id_matching for virtio drivers
This patch allows a virtio driver to use VIRTIO_DEV_ANY_ID for the
device id. This will be used by a test module that can be bound to
any virtio device.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-06-12 22:16:40 +09:30
Christian Borntraeger c89e80168b virtio: fix id_matching for virtio drivers
This bug never appeared, since all current virtio drivers use
VIRTIO_DEV_ANY_ID for the vendor field. If a real vendor were used,
the check in virtio_id_match is wrong - it returns 0 if
id->vendor == dev->id.vendor.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-06-12 22:16:40 +09:30
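The matching rule described in the two id_matching commits above can be sketched roughly as follows. This is a simplified, standalone illustration, not the kernel's code: the `sketch_id` type, the `sketch_id_match()` name, and the locally defined `SKETCH_ANY_ID` wildcard (standing in for VIRTIO_DEV_ANY_ID) are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for VIRTIO_DEV_ANY_ID, defined locally for this sketch. */
#define SKETCH_ANY_ID 0xffffffffu

/* Hypothetical, reduced form of a device/driver ID pair. */
struct sketch_id {
	uint32_t device;
	uint32_t vendor;
};

/*
 * Return 1 if the driver's table entry 'id' matches the device 'dev'.
 * A field matches when it is equal or when the driver used the wildcard.
 * The buggy variant described above returned 0 on an equal vendor.
 */
static int sketch_id_match(const struct sketch_id *dev,
			   const struct sketch_id *id)
{
	assert(dev && id);

	if (id->device != dev->device && id->device != SKETCH_ANY_ID)
		return 0;

	return id->vendor == SKETCH_ANY_ID || id->vendor == dev->vendor;
}
```

With this shape, an entry that names a real vendor matches only devices from that vendor, while an entry that uses the wildcard in both fields binds to any device, which is what a test module would want.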
Rusty Russell 594de1dd64 virtio: handle short buffers in virtio_rng.
If the device fills less than 4 bytes of our random buffer, we'll
BUG_ON.  It's nicer to handle the case where it partially fills the
buffer (the protocol doesn't explicitly ban that).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-06-12 22:16:40 +09:30
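One way to tolerate short fills, shown here only as a userspace sketch and not as the driver's actual implementation, is to count delivered bytes and hand out a 32-bit word only once four bytes have accumulated. The `rng_buf` type and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical accumulation buffer for bytes delivered by the device. */
struct rng_buf {
	unsigned char bytes[64];
	size_t filled;		/* number of valid bytes at the front */
};

/* Record a completion of 'len' bytes; 'len' may be smaller than requested. */
static void rng_buf_add(struct rng_buf *b, const void *src, size_t len)
{
	assert(b->filled + len <= sizeof(b->bytes));
	memcpy(b->bytes + b->filled, src, len);
	b->filled += len;
}

/*
 * Hand out one 32-bit word if enough bytes have accumulated.
 * Returns 1 on success and 0 if the caller has to wait for more data,
 * instead of treating a partial fill as a fatal condition.
 */
static int rng_buf_read_word(struct rng_buf *b, uint32_t *out)
{
	if (b->filled < sizeof(*out))
		return 0;

	memcpy(out, b->bytes, sizeof(*out));
	b->filled -= sizeof(*out);
	memmove(b->bytes, b->bytes + sizeof(*out), b->filled);
	return 1;
}
```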
Mike Frysinger 98e9444474 virtio_blk: add missing __dev{init,exit} markings
The remove member of the virtio_driver structure uses __devexit_p(), so
the remove function itself should be marked with __devexit.  And where
there be __devexit on the remove, so is there __devinit on the probe.

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-06-12 22:16:39 +09:30
Mark McLoughlin 9fa29b9df3 virtio: indirect ring entries (VIRTIO_RING_F_INDIRECT_DESC)
Add a new feature flag for indirect ring entries. These are ring
entries which point to a table of buffer descriptors.

The idea here is to increase the ring capacity by allowing a larger
effective ring size whereby the ring size dictates the number of
requests that may be outstanding, rather than the size of those
requests.

This should be most effective in the case of block I/O where we can
potentially benefit by concurrently dispatching a large number of
large requests. Even in the simple case of single segment block
requests, this results in a threefold increase in ring capacity.

Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-06-12 22:16:39 +09:30
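The capacity argument in the indirect-descriptor commit above can be illustrated with a little arithmetic. The sketch assumes that a single-segment block request consumes three descriptors (request header, one data segment, status byte); that is an assumption about where the "threefold" figure comes from, and `max_outstanding()` is a hypothetical helper, not a kernel function.

```c
#include <assert.h>

/*
 * Hypothetical helper: how many requests can be outstanding at once
 * in a ring with 'ring_size' entries.
 */
static unsigned int max_outstanding(unsigned int ring_size,
				    unsigned int descs_per_request,
				    int use_indirect)
{
	assert(descs_per_request > 0);

	/* With an indirect table, each request occupies one ring entry. */
	if (use_indirect)
		return ring_size;

	/* Otherwise every descriptor of the request takes a ring entry. */
	return ring_size / descs_per_request;
}
```

For a 128-entry ring and three descriptors per request, the direct layout allows 42 outstanding requests and the indirect layout allows 128, roughly the threefold gain the message mentions.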
Mark McLoughlin ee006b353f virtio: teach virtio_has_feature() about transport features
Drivers don't add transport features to their table, so we
shouldn't check these with virtio_check_driver_offered_feature().

We could perhaps add an ->offered_feature() virtio_config_op,
but perhaps that would be overkill for a consistency check
like this.

Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-06-12 22:16:38 +09:30
Rusty Russell a92892825a virtio: expose features in sysfs
Each device negotiates feature bits; expose these in sysfs to help
diagnostics and debugging.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-06-12 22:16:38 +09:30
Michael S. Tsirkin 82af8ce84e virtio_pci: optional MSI-X support
This implements optional MSI-X support in virtio_pci.
MSI-X is used whenever the host supports at least 2 MSI-X
vectors: 1 for configuration changes and 1 for virtqueues.
Per-virtqueue vectors are allocated if enough vectors are
available.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (+ whitespace, style)
2009-06-12 22:16:37 +09:30
Michael S. Tsirkin 77cf524654 virtio_pci: split up vp_interrupt
This reorganizes virtio-pci code in vp_interrupt slightly, so that
it's easier to add per-vq MSI support on top.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-06-12 22:16:37 +09:30
Michael S. Tsirkin d2a7ddda9f virtio: find_vqs/del_vqs virtio operations
This replaces find_vq/del_vq with find_vqs/del_vqs virtio operations,
and updates all drivers. This is needed for MSI support, because MSI
needs to know the total number of vectors upfront.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (+ lguest/9p compile fixes)
2009-06-12 22:16:36 +09:30
Rusty Russell 9499f5e7ed virtio: add names to virtqueue struct, mapping from devices to queues.
Add a linked list of all virtqueues for a virtio device: this helps for
debugging and is also needed for an upcoming interface change.

Also, add a "name" field for clearer debug messages.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-06-12 22:16:36 +09:30
Rusty Russell ef688e151c virtio: meet virtio spec by finalizing features before using device
Virtio devices are supposed to negotiate features before they start using
the device, but the current code doesn't do this.  This is because the
driver's probe() function invariably has to add buffers to a virtqueue,
or probe the disk (virtio_blk).

This currently doesn't matter since no existing backend is strict about
the feature negotiation.  But it's possible to imagine a future feature
which completely changes how a device operates: in this case, we'd need
to acknowledge it before using the device.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-06-12 22:16:35 +09:30
Rusty Russell 20f77f5654 virtio: fix obsolete documentation on probe function
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-06-12 22:16:35 +09:30
Steven Whitehouse 3ea400581f GFS2: Remove lock_kernel from gfs2_put_super()
It is not required here.

Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
2009-06-12 13:40:47 +01:00
Peter Zijlstra 974802eaa1 perf_counter: Add forward/backward attribute ABI compatibility
Provide a means of extending the perf_counter_attr in a 'natural' way.

We allow growing the structure by appending fields at the end by specifying
the full structure size inside it.

When a new kernel sees a smaller (old) structure, it will 0 pad the tail.
When an old kernel sees a larger (new) structure, it will verify the tail
consists of 0s, otherwise fail.

If we fail due to a size-mismatch, we return -E2BIG and write the kernel's
native attribute size back into the provided structure.

Furthermore, add some attribute verification, so that we'll fail counter
creation when unknown bits are present (PERF_SAMPLE, PERF_FORMAT, or in
the __reserved fields).

(This ABI detail is introduced while keeping the existing syscall ABI.)

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-12 14:28:52 +02:00
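The size-based compatibility rules in the attribute ABI commit above can be sketched in plain C. This is a userspace illustration under stated assumptions, not the kernel's copy routine: `sketch_attr` is a hypothetical stand-in for the real structure, `sketch_copy_attr()` is an invented name, and the minimum-size rule is a choice made only for the sketch. The caller is assumed to pass a buffer at least as long as the size it declares.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the real attribute structure. */
struct sketch_attr {
	uint32_t type;
	uint32_t size;		/* structure size as the caller knows it */
	uint64_t config;
};

#define SIZE_OFF offsetof(struct sketch_attr, size)

/* Copy a caller-provided attribute of any declared size into 'out'. */
static int sketch_copy_attr(unsigned char *uattr, struct sketch_attr *out)
{
	const uint32_t native = sizeof(*out);
	uint32_t size, i;

	assert(uattr && out);
	memcpy(&size, uattr + SIZE_OFF, sizeof(size));

	/* Sketch-only rule: the caller must at least provide 'size'. */
	if (size < SIZE_OFF + sizeof(size))
		goto bad_size;

	/* Newer caller: accept only if the unknown tail is all zeroes. */
	if (size > native) {
		for (i = native; i < size; i++)
			if (uattr[i] != 0)
				goto bad_size;
		size = native;
	}

	/* Older caller: fields it did not know about stay zero. */
	memset(out, 0, sizeof(*out));
	memcpy(out, uattr, size);
	out->size = native;
	return 0;

bad_size:
	/* Report the native size back to the caller and fail. */
	memcpy(uattr + SIZE_OFF, &native, sizeof(native));
	return -E2BIG;
}
```

The zero-tail check is what lets an old consumer accept a larger structure safely: a non-zero byte in the tail means the caller asked for something this consumer cannot honor.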
Peter Zijlstra bbd36e5e6a perf record: Explicitly program a default counter
Up until now record has worked on the assumption that type=0, config=0
was a suitable configuration - which it is. Let's make this a little more
explicit and more readable via the use of proper symbols.

[ Impact: cleanup ]

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-12 14:28:52 +02:00
Peter Zijlstra 081fad8617 perf_counter: Remove PERF_TYPE_RAW special casing
The PERF_TYPE_RAW special case seems superfluous these days. Remove
it and add it to the switch() stmt like the others.

[ Impact: cleanup ]

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-12 14:28:51 +02:00
Peter Zijlstra f1a3c97905 perf_counter: PERF_TYPE_HW_CACHE is a hardware counter too
is_software_counter() was missing the new HW_CACHE category.

( This could have caused some counter scheduling artifacts
  with mixed sw and hw counters and counter groups. )

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-12 14:28:51 +02:00
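The classification fixed by the HW_CACHE commit above might look like the following in reduced form. The enum and the function name are hypothetical stand-ins for the kernel's counter types and helper; the point is only that the cache category has to be excluded alongside the other hardware categories.

```c
#include <assert.h>

/* Hypothetical counter categories, named after the ones in the message. */
enum sketch_type {
	SKETCH_TYPE_HARDWARE,
	SKETCH_TYPE_SOFTWARE,
	SKETCH_TYPE_TRACEPOINT,
	SKETCH_TYPE_HW_CACHE,
	SKETCH_TYPE_RAW,
	SKETCH_TYPE_MAX
};

/*
 * A counter is a software counter unless it belongs to one of the
 * hardware-backed categories. Leaving out the HW_CACHE test is the
 * omission the commit describes.
 */
static int sketch_is_software(enum sketch_type type)
{
	assert(type < SKETCH_TYPE_MAX);

	return type != SKETCH_TYPE_RAW &&
	       type != SKETCH_TYPE_HARDWARE &&
	       type != SKETCH_TYPE_HW_CACHE;
}
```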
Jaswinder Singh Rajput 4c921126fe powerpc, perf_counter: Fix performance counter event types
Sachin Sant reported these compiler errors:

 CC      arch/powerpc/kernel/power7-pmu.o
arch/powerpc/kernel/power7-pmu.c:297: error: PERF_COUNT_CPU_CYCLES undeclared here (not in a function)

Which happened because a last-minute rename of symbols crossed with
the Power7 support patch.

Fix this by using the new symbol names.

Reported-by: Sachin Sant <sachinp@in.ibm.com>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@ozlabs.org
LKML-Reference: <1244788494.5554.1.camel@ht.satnam>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-12 14:21:11 +02:00
Rusty Russell 5933048c69 module: cleanup FIXME comments about trimming exception table entries.
Everyone cut and paste this comment from my original one.  We now do
it generically, so cut the comments.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Amerigo Wang <amwang@redhat.com>
2009-06-12 21:47:05 +09:30
Rusty Russell ad6561dffa module: trim exception table on init free.
It's theoretically possible that there are exception table entries
which point into the (freed) init text of modules.  These could cause
future problems if other modules get loaded into that memory and cause
an exception as we'd see the wrong fixup.  The only case I know of is
kvm-intel.ko (when CONFIG_CC_OPTIMIZE_FOR_SIZE=n).

Amerigo fixed this long-standing FIXME in the x86 version, but this
patch is more general.

This implements trim_init_extable(); most archs are simple since they
use the standard lib/extable.c sort code.  Alpha and IA64 use relative
addresses in their fixups, so their trimming is a slight variation.

Sparc32 is unique; it doesn't seem to define ARCH_HAS_SORT_EXTABLE,
yet it defines its own sort_extable() which overrides the one in lib.
It doesn't sort, so we have to mark deleted entries instead of
actually trimming them.

Inspired-by: Amerigo Wang <amwang@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: linux-alpha@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Cc: linux-ia64@vger.kernel.org
2009-06-12 21:47:04 +09:30
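For the simple case in the exception-table trimming commit above, where the table is sorted by instruction address, trimming can be sketched as dropping entries from both ends while they point into the init region. The types and names below are hypothetical simplifications for illustration, and the sketch assumes the init region lies entirely below or entirely above the rest of the module's code, so that init entries sit at one end of the sorted table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced exception table entry. */
struct sketch_extable_entry {
	unsigned long insn;	/* address of the faulting instruction */
	unsigned long fixup;	/* address to resume at */
};

/* Hypothetical, reduced module descriptor. */
struct sketch_module {
	struct sketch_extable_entry *extable;	/* sorted by insn */
	size_t num_exentries;
	unsigned long init_start;
	unsigned long init_size;
};

static int in_init(const struct sketch_module *m, unsigned long addr)
{
	return addr >= m->init_start && addr - m->init_start < m->init_size;
}

/* Drop entries that point into init text, which is about to be freed. */
static void sketch_trim_init_extable(struct sketch_module *m)
{
	assert(m);

	/* Trim the beginning. */
	while (m->num_exentries && in_init(m, m->extable[0].insn)) {
		m->extable++;
		m->num_exentries--;
	}

	/* Trim the end. */
	while (m->num_exentries &&
	       in_init(m, m->extable[m->num_exentries - 1].insn))
		m->num_exentries--;
}
```

An unsorted table, as described for Sparc32, cannot be trimmed this way, which is why that case marks entries as deleted instead.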
Amerigo Wang c398df30d5 module: merge module_alloc() finally
As Christoph Hellwig suggested, module_alloc() actually can be
unified for i386 and x86_64 (of course, also UML).

Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: 'Ingo Molnar' <mingo@elte.hu>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-06-12 21:47:03 +09:30
Amerigo Wang c0e5e10bf3 uml module: fix uml build process due to this merge
Due to the previous merge, uml needs to be fixed.

Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-06-12 21:47:02 +09:30
Amerigo Wang 0fdc83b950 x86 module: merge the remaining functions with macros
Merge the remaining functions together, with proper preprocessing directives.
Finally remove module_{32|64}.c.

Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-06-12 21:47:01 +09:30