Commit Graph

167721 Commits

Author SHA1 Message Date
Michael S. Tsirkin
2d61ba9503 virtio: order used ring after used index read
On SMP guests, reads from the ring might bypass used index reads. This
causes guest crashes because host writes to used index to signal ring
data readiness.  Fix this by inserting rmb before used ring reads.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org
2009-10-29 08:50:37 +10:30
Michael S. Tsirkin
0b22bd0ba0 virtio-pci: fix per-vq MSI-X request logic
Commit f68d24082e
in 2.6.32-rc1 broke requesting IRQs for per-VQ MSI-X vectors:
- vector number was used instead of the vector itself
- we try to request an IRQ for VQ which does not
  have a callback handler

This is a regression that causes warnings in kernel log,
potentially lower performance as we need to scan vq list,
and might cause system failure if the interrupt
requested is in fact needed by another system.

This was not noticed earlier because in most cases
we were falling back on shared interrupt for all vqs.

The warnings often look like this:

virtio-pci 0000:00:03.0: irq 26 for MSI/MSI-X
virtio-pci 0000:00:03.0: irq 27 for MSI/MSI-X
virtio-pci 0000:00:03.0: irq 28 for MSI/MSI-X
IRQ handler type mismatch for IRQ 1
current handler: i8042
Pid: 2400, comm: modprobe Tainted: G        W
2.6.32-rc3-11952-gf3ed8d8-dirty #1
Call Trace:
 [<ffffffff81072aed>] ? __setup_irq+0x299/0x304
 [<ffffffff81072ff3>] ? request_threaded_irq+0x144/0x1c1
 [<ffffffff813455af>] ? vring_interrupt+0x0/0x30
 [<ffffffff81346598>] ? vp_try_to_find_vqs+0x583/0x5c7
 [<ffffffffa0015188>] ? skb_recv_done+0x0/0x34 [virtio_net]
 [<ffffffff81346609>] ? vp_find_vqs+0x2d/0x83
 [<ffffffff81345d00>] ? vp_get+0x3c/0x4e
 [<ffffffffa0016373>] ? virtnet_probe+0x2f1/0x428 [virtio_net]
 [<ffffffffa0015188>] ? skb_recv_done+0x0/0x34 [virtio_net]
 [<ffffffffa00150d8>] ? skb_xmit_done+0x0/0x39 [virtio_net]
 [<ffffffff8110ab92>] ? sysfs_do_create_link+0xcb/0x116
 [<ffffffff81345cc2>] ? vp_get_status+0x14/0x16
 [<ffffffff81345464>] ? virtio_dev_probe+0xa9/0xc8
 [<ffffffff8122b11c>] ? driver_probe_device+0x8d/0x128
 [<ffffffff8122b206>] ? __driver_attach+0x4f/0x6f
 [<ffffffff8122b1b7>] ? __driver_attach+0x0/0x6f
 [<ffffffff8122a9f9>] ? bus_for_each_dev+0x43/0x74
 [<ffffffff8122a374>] ? bus_add_driver+0xea/0x22d
 [<ffffffff8122b4a3>] ? driver_register+0xa7/0x111
 [<ffffffffa001a000>] ? init+0x0/0xc [virtio_net]
 [<ffffffff81009051>] ? do_one_initcall+0x50/0x148
 [<ffffffff8106e117>] ? sys_init_module+0xc5/0x21a
 [<ffffffff8100af02>] ? system_call_fastpath+0x16/0x1b
virtio-pci 0000:00:03.0: irq 26 for MSI/MSI-X
virtio-pci 0000:00:03.0: irq 27 for MSI/MSI-X

Reported-by: Marcelo Tosatti <mtosatti@redhat.com>
Reported-by: Shirley Ma <xma@us.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-10-29 08:50:36 +10:30
Linus Torvalds
964fe080d9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
  move virtrng_remove to .devexit.text
  move virtballoon_remove to .devexit.text
  virtio_blk: Revert serial number support
  virtio: let header files include virtio_ids.h
  virtio_blk: revert QUEUE_FLAG_VIRT addition
2009-10-23 07:35:16 +09:00
Linus Torvalds
4848490c50 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits)
  niu: VLAN_ETH_HLEN should be used to make sure that the whole MAC header was copied to the head buffer in the Vlan packets case
  KS8851: Fix ks8851_set_rx_mode() for IFF_MULTICAST
  KS8851: Fix MAC address write order
  KS8851: Add soft reset at probe time
  net: fix section mismatch in fec.c
  net: Fix struct inet_timewait_sock bitfield annotation
  tcp: Try to catch MSG_PEEK bug
  net: Fix IP_MULTICAST_IF
  bluetooth: static lock key fix
  bluetooth: scheduling while atomic bug fix
  tcp: fix TCP_DEFER_ACCEPT retrans calculation
  tcp: reduce SYN-ACK retrans for TCP_DEFER_ACCEPT
  tcp: accept socket after TCP_DEFER_ACCEPT period
  Revert "tcp: fix tcp_defer_accept to consider the timeout"
  AF_UNIX: Fix deadlock on connecting to shutdown socket
  ethoc: clear only pending irqs
  ethoc: inline regs access
  vmxnet3: use dev_dbg, fix build for CONFIG_BLOCK=n
  virtio_net: use dev_kfree_skb_any() in free_old_xmit_skbs()
  be2net: fix support for PCI hot plug
  ...
2009-10-23 07:34:23 +09:00
Uwe Kleine-König
ff07eb897a move virtrng_remove to .devexit.text
The function virtrng_remove is used only wrapped by __devexit_p so define
it using __devexit.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-10-22 16:39:34 +10:30
Uwe Kleine-König
1e65175c2c move virtballoon_remove to .devexit.text
The function virtballoon_remove is used only wrapped by __devexit_p so
define it using __devexit.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-10-22 16:39:31 +10:30
Rusty Russell
3225beaba0 virtio_blk: Revert serial number support
This reverts "Add serial number support for virtio_blk, V4a".

Turns out that virtio_pci, lguest and s/390 all have an 8 bit limit
on virtio config space, so noone could ever use this.

This is coming back later in a cleaner form.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: john cooper <john.cooper@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
2009-10-22 16:39:30 +10:30
Christian Borntraeger
e95646c3ec virtio: let header files include virtio_ids.h
Rusty,

commit 3ca4f5ca73
    virtio: add virtio IDs file
moved all device IDs into a single file. While the change itself is
a very good one, it can break userspace applications. For example
if a userspace tool wanted to get the ID of virtio_net it used to
include virtio_net.h. This does no longer work, since virtio_net.h
does not include virtio_ids.h.
This patch moves all "#include <linux/virtio_ids.h>" from the C
files into the header files, making the header files compatible with
the old ones.

In addition, this patch exports virtio_ids.h to userspace.

CC: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-10-22 16:39:28 +10:30
Christoph Hellwig
f8b12e513b virtio_blk: revert QUEUE_FLAG_VIRT addition
It seems like the addition of QUEUE_FLAG_VIRT caueses major performance
regressions for Fedora users:

	https://bugzilla.redhat.com/show_bug.cgi?id=509383
	https://bugzilla.redhat.com/show_bug.cgi?id=505695

while I can't reproduce those extreme regressions myself I think the flag
is wrong.

Rationale:

  QUEUE_FLAG_VIRT expands to QUEUE_FLAG_NONROT which casus the queue
  unplugged immediately.  This is not a good behaviour for at least
  qemu and kvm where we do have significant overhead for every
  I/O operations.  Even with all the latested speeups (native AIO,
  MSI support, zero copy) we can only get native speed for up to 128kb
  I/O requests we already are down to 66% of native performance for 4kb
  requests even on my laptop running the Intel X25-M SSD for which the
  QUEUE_FLAG_NONROT was designed.
  If we ever get virtio-blk overhead low enough that this flag makes
  sense it should only be set based on a feature flag set by the host.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-10-22 16:39:26 +10:30
Joyce Yu
845de8afa6 niu: VLAN_ETH_HLEN should be used to make sure that the whole MAC header was copied to the head buffer in the Vlan packets case
Signed-off-by: Joyce Yu <joyce.yu@sun.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-21 17:21:10 -07:00
Linus Torvalds
d995053d04 Merge branch 'for-linus' of git://git.infradead.org/users/eparis/notify
* 'for-linus' of git://git.infradead.org/users/eparis/notify:
  dnotify: ignore FS_EVENT_ON_CHILD
  inotify: fix coalesce duplicate events into a single event in special case
  inotify: deprecate the inotify kernel interface
  fsnotify: do not set group for a mark before it is on the i_list
2009-10-22 08:28:28 +09:00
Linus Torvalds
be8db0b843 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: hp_sdc_rtc - fix test in hp_sdc_rtc_read_rt()
  Input: atkbd - consolidate force release quirks for volume keys
  Input: logips2pp - model 73 is actually TrackMan FX
  Input: i8042 - add Sony Vaio VGN-FZ240E to the nomux list
  Input: fix locking issue in /proc/bus/input/ handlers
  Input: atkbd - postpone restoring LED/repeat rate at resume
  Input: atkbd - restore resetting LED state at startup
  Input: i8042 - make pnp_data_busted variable boolean instead of int
  Input: synaptics - add another Protege M300 to rate blacklist
2009-10-22 08:27:12 +09:00
Linus Torvalds
422b42fa79 Merge branch 'kvm-updates/2.6.32' of git://git.kernel.org/pub/scm/virt/kvm/kvm
* 'kvm-updates/2.6.32' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: Prevent kvm_init from corrupting debugfs structures
  KVM: MMU: fix pointer cast
  KVM: use proper hrtimer function to retrieve expiration time
2009-10-22 08:26:15 +09:00
Linus Torvalds
1b7607030d Merge git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm
* git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm:
  dm snapshot: allow chunk size to be less than page size
  dm snapshot: use unsigned integer chunk size
  dm snapshot: lock snapshot while supplying status
  dm exception store: fix failed set_chunk_size error path
  dm snapshot: require non zero chunk size by end of ctr
  dm: dec_pending needs locking to save error value
  dm: add missing del_gendisk to alloc_dev error path
  dm log: userspace fix incorrect luid cast in userspace_ctr
  dm snapshot: free exception store on init failure
  dm snapshot: sort by chunk size to fix race
2009-10-22 08:25:36 +09:00
Rafael J. Wysocki
04bf7539c0 PM: Make warning in suspend_test_finish() less likely to happen
Increase TEST_SUSPEND_SECONDS to 10 so the warning in
suspend_test_finish() doesn't annoy the users of slower systems so much.

Also, make the warning print the suspend-resume cycle time, so that we
know why the warning actually triggered.

Patch prepared during the hacking session at the Kernel Summit in Tokyo.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-22 08:23:45 +09:00
Uwe Kleine-König
af2bd9d534 mmc: at91_mci: Don't include asm/mach/mmc.h
This fixes a compile bug introduced in

	6ef297f (ARM: 5720/1: Move MMCI header to amba include dir)

That commit moved arch/arm/include/asm/mach/mmc.h to
include/linux/amba/mmci.h.  Just removing the include was enough.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Acked-by: Bill Gatliff <bgat@billgatliff.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Pierre Ossman <drzeus@drzeus.cx>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-22 08:22:25 +09:00
Linus Torvalds
ab4ed677f3 Merge branch 'sh/for-2.6.32' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6
* 'sh/for-2.6.32' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
  sh: Kill off stray HAVE_FTRACE_SYSCALLS reference.
  sh: Remove BKL from landisk gio.
  sh: disabled cache handling fix.
  sh: Fix up single page flushing to use PAGE_SIZE.
2009-10-22 08:17:15 +09:00
Linus Torvalds
4fe71dba2f Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: aesni-intel - Fix irq_fpu_usable usage
  crypto: padlock-sha - Fix stack alignment
2009-10-22 08:16:01 +09:00
Yinghai Lu
4223a4a155 nfs: Fix nfs_parse_mount_options() kfree() leak
Fix a (small) memory leak in one of the error paths of the NFS mount
options parsing code.

Regression introduced in 2.6.30 by commit a67d18f (NFS: load the
rpc/rdma transport module automatically).

Reported-by: Yinghai Lu <yinghai@kernel.org>
Reported-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-22 08:15:23 +09:00
Earl Chew
ad3960243e fs: pipe.c null pointer dereference
This patch fixes a null pointer exception in pipe_rdwr_open() which
generates the stack trace:

> Unable to handle kernel NULL pointer dereference at 0000000000000028 RIP:
>  [<ffffffff802899a5>] pipe_rdwr_open+0x35/0x70
>  [<ffffffff8028125c>] __dentry_open+0x13c/0x230
>  [<ffffffff8028143d>] do_filp_open+0x2d/0x40
>  [<ffffffff802814aa>] do_sys_open+0x5a/0x100
>  [<ffffffff8021faf3>] sysenter_do_call+0x1b/0x67

The failure mode is triggered by an attempt to open an anonymous
pipe via /proc/pid/fd/* as exemplified by this script:

=============================================================
while : ; do
   { echo y ; sleep 1 ; } | { while read ; do echo z$REPLY; done ; } &
   PID=$!
   OUT=$(ps -efl | grep 'sleep 1' | grep -v grep |
        { read PID REST ; echo $PID; } )
   OUT="${OUT%% *}"
   DELAY=$((RANDOM * 1000 / 32768))
   usleep $((DELAY * 1000 + RANDOM % 1000 ))
   echo n > /proc/$OUT/fd/1                 # Trigger defect
done
=============================================================

Note that the failure window is quite small and I could only
reliably reproduce the defect by inserting a small delay
in pipe_rdwr_open(). For example:

 static int
 pipe_rdwr_open(struct inode *inode, struct file *filp)
 {
       msleep(100);
       mutex_lock(&inode->i_mutex);

Although the defect was observed in pipe_rdwr_open(), I think it
makes sense to replicate the change through all the pipe_*_open()
functions.

The core of the change is to verify that inode->i_pipe has not
been released before attempting to manipulate it. If inode->i_pipe
is no longer present, return ENOENT to indicate so.

The comment about potentially using atomic_t for i_pipe->readers
and i_pipe->writers has also been removed because it is no longer
relevant in this context. The inode->i_mutex lock must be used so
that inode->i_pipe can be dealt with correctly.

Signed-off-by: Earl Chew <earl_chew@agilent.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-10-22 08:11:44 +09:00
Ben Dooks
b6a71bfa00 KS8851: Fix ks8851_set_rx_mode() for IFF_MULTICAST
In ks8851_set_rx_mode() the case handling IFF_MULTICAST was also setting
the RXCR1_AE bit by accident. This meant that all unicast frames where
being accepted by the device. Remove RXCR1_AE from this case.

Note, RXCR1_AE was also masking a problem with setting the MAC address
properly, so needs to be applied after fixing the MAC write order.

Fixes a bug reported by Doong, Ping of Micrel. This version of the
patch avoids setting RXCR1_ME for all cases.

Signed-off-by: Ben Dooks <ben@simtec.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-20 19:11:07 -07:00
Ben Dooks
160d0fadaf KS8851: Fix MAC address write order
The MAC address register was being written in the wrong order, so add
a new address macro to convert mac-address byte to register address and
a ks8851_wrreg8() function to write each byte without having to worry
about any difficult byte swapping.

Fixes a bug reported by Doong, Ping of Micrel.

Signed-off-by: Ben Dooks <ben@simtec.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-20 19:11:06 -07:00
Ben Dooks
57dada6819 KS8851: Add soft reset at probe time
Issue a full soft reset at probe time.

This was reported by Doong Ping of Micrel, but no explanation of why this
is necessary or what bug it is fixing. Add it as it does not seem to hurt
the current driver and ensures that the device is in a known state when we
start setting it up.

Signed-off-by: Ben Dooks <ben@simtec.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-20 19:11:05 -07:00
Steven King
78abcb13dd net: fix section mismatch in fec.c
fec_enet_init is called by both fec_probe and fec_resume, so it
shouldn't be marked as __init.

Signed-off-by: Steven King <sfking@fdwdc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-10-20 18:51:37 -07:00
Andreas Gruenbacher
945526846a dnotify: ignore FS_EVENT_ON_CHILD
Mask off FS_EVENT_ON_CHILD in dnotify_handle_event().  Otherwise, when there
is more than one watch on a directory and dnotify_should_send_event()
succeeds, events with FS_EVENT_ON_CHILD set will trigger all watches and cause
spurious events.

This case was overlooked in commit e42e2773.

	#define _GNU_SOURCE

	#include <stdio.h>
	#include <stdlib.h>
	#include <unistd.h>
	#include <signal.h>
	#include <sys/types.h>
	#include <sys/stat.h>
	#include <fcntl.h>
	#include <string.h>

	static void create_event(int s, siginfo_t* si, void* p)
	{
		printf("create\n");
	}

	static void delete_event(int s, siginfo_t* si, void* p)
	{
		printf("delete\n");
	}

	int main (void) {
		struct sigaction action;
		char *tmpdir, *file;
		int fd1, fd2;

		sigemptyset (&action.sa_mask);
		action.sa_flags = SA_SIGINFO;

		action.sa_sigaction = create_event;
		sigaction (SIGRTMIN + 0, &action, NULL);

		action.sa_sigaction = delete_event;
		sigaction (SIGRTMIN + 1, &action, NULL);

	#	define TMPDIR "/tmp/test.XXXXXX"
		tmpdir = malloc(strlen(TMPDIR) + 1);
		strcpy(tmpdir, TMPDIR);
		mkdtemp(tmpdir);

	#	define TMPFILE "/file"
		file = malloc(strlen(tmpdir) + strlen(TMPFILE) + 1);
		sprintf(file, "%s/%s", tmpdir, TMPFILE);

		fd1 = open (tmpdir, O_RDONLY);
		fcntl(fd1, F_SETSIG, SIGRTMIN);
		fcntl(fd1, F_NOTIFY, DN_MULTISHOT | DN_CREATE);

		fd2 = open (tmpdir, O_RDONLY);
		fcntl(fd2, F_SETSIG, SIGRTMIN + 1);
		fcntl(fd2, F_NOTIFY, DN_MULTISHOT | DN_DELETE);

		if (fork()) {
			/* This triggers a create event */
			creat(file, 0600);
			/* This triggers a create and delete event (!) */
			unlink(file);
		} else {
			sleep(1);
			rmdir(tmpdir);
		}

		return 0;
	}

Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Eric Paris <eparis@redhat.com>
2009-10-20 18:02:33 -04:00