Force irq migration path during cpu offline, is not using proper locks and
irq_chip mask/unmask routines. This will result in some races(especially
the device generating the interrupt can see some inconsistent state,
resulting in issues like stuck irq,..).
Appended patch fixes the issue by taking proper lock and encapsulating
irq_chip set_affinity() with a mask() before and an unmask() after.
This fixes a MSI irq stuck issue reported by Darrick Wong.
There are several more general bugs in this area(irq migration in the
process context). For example,
1. Possibility of missing edge triggered irq.
2. Reliable method of migrating level triggered irq in the process context.
We plan to look and close these in the near future.
Eric says:
In addition even with the fix from Suresh there is still at least one
nasty hardware race in fixup_irqs(). However we exercise that code
path rarely enough that we are unlikely to hit it in the real world,
and that race seems to have existed since the code was merged. And a
fix for that is not coming soon as it is an open investigation area
if we can fix irq migration to work outside of irq context or if
we have to rework the requirements imposed by the generic cpu hotplug
and layer on fixup_irqs(). So this may come up again.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Reported-and-tested-by: Darrick Wong <djwong@us.ibm.com>
Cc: Andi Kleen <ak@suse.de>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* master.kernel.org:/home/rmk/linux-2.6-arm:
[ARM] 4449/1: more entries in arch/arm/boot/.gitignore
[ARM] 4452/1: Force the literal pool dump before reloc_end
[ARM] Update show_regs/oops register format
[ARM] Add support for pause_on_oops and display preempt/smp options
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
[MIPS] Count timer interrupts correctly.
[MIPS] SMTC and non-SMTC kernel and modules are incompatible
[MIPS] EMMA2RH: Disable GEN_RTC, it can't possibly work.
[MIPS] Remove a duplicated local variable in test_and_clear_bit()
[MIPS] use compat_siginfo in rt_sigframe_n32
[MIPS] 20K: Handle WAIT related bugs according to errata information
[MIPS] AP/SP requires shadow registers, auto enable support.
[MIPS] Fix pb1500 reg B access
[MIPS] Alchemy: Fix wrong cast
[MIPS] remove "support for" from system type entry
[MIPS] add io_map_base to pci_controller on Cobalt
[MIPS] __ucmpdi2 arguments are unsigned long long.
Neither rtc_mips_get_time nor rtc_mips_set_time are being initialized by
the EMMA2RH setup code, so genrtc at best was a RTC dummy avoiding a few
error messages but not providing actual functionality.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
We used to avoid the WAIT entirely on the 20K but really only need to do
this on early revs of the 20K. Without this a 20K was a bit of a
power hog. Well, in the lower power power hog category ;-)
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This fixes a bug which can cause corruption of the floating-point state
on return from a signal handler. If we have a signal handler that has
used the floating-point registers, and it happens to context-switch to
another task while copying the interrupted floating-point state from the
user stack into the thread struct (e.g. because of a page fault, or
because it gets preempted), the context switch code will think that the
FP registers contain valid FP state that needs to be copied into the
thread_struct, and will thus overwrite the values that the signal return
code has put into the thread_struct.
This can occur because we clear the MSR bits that indicate the presence
of valid FP state after copying the state into the thread_struct. To fix
this we just move the clearing of the MSR bits to before the copy. A
similar potential problem also occurs with the Altivec state, and this
fixes that in the same way.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Consider the prototype for gettimeofday():
int gettimofday(struct timeval *tv, struct timezone *tz);
Although it is valid to call with /either/ tv or tz being NULL, and
the C version of sys_gettimeofday() supports this, the current version
of gettimeofday() in the VDSO will SEGV if called with a NULL tv.
This adds a check for tv being NULL so that it doesn't SEGV.
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Update the g5_defconfig with default settings.
This is to keep things up to date, and specifically to ensure that the
CONFIG_MACINTOSH_DRIVERS option is enabled. This also turns on
CONFIG_MSI.
Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com>
cc: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
In the arch/arm/boot/compressed/head.S file, the contents of the
literal pool accumulated during the relocatable code must be dumped
before reloc_end.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Register %ebx serves as the "global offset table base register" for
position-independent code. For absolute code, %ebx serves as a local
register and has no specified role in the function calling sequence. In
either case, a function must preserve the register value for the caller.
acpi_copy_wakeup_routine overrides %ebx without saving it, this may corrupt
the called data.
Kevin found that most time the value of Sx is saved in %esi, however
sometimes compiler also uses %ebx. When this happens, suspends fails since
sleep value in ebx is changed by acpi_copy_wakeup_routine.
The same funtion in X86_64 doesn't have this problem.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Looks-okay-to: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Len Brown <lenb@kernel.org>
Acked-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Background:
When a userspace application wants to know about machine check events, it
opens /dev/mcelog and does a read(). Usually, we found that this interface
works well, but in some cases, when the system was taking large numbers of
machine check exceptions, the read() would hang. The system would output a
soft-lockup warning, and the daemon reading from /dev/mcelog would suck up
as much of a single CPU as it could spinning in system space.
Description:
This patch fixes this bug. In particular, there was a "continue" inside a
timeout loop that presumably was intended to break out of the outer loop,
but instead caused the inner loop to continue. This patch also makes the
condition for the break-out a little more evident by changing a
!time_before to a time_after_eq.
Result:
The read() no longer hangs in this test case.
Testing:
On my system, I could replicate the bug with the following command:
# for i in `seq 15000`; do ./inject_sbe.sh; done
where inject_sbe.sh contains commands to inject a single-bit error into the
next memory write transaction.
Patch:
This patch is against git f1518a088b.
Signed-off-by: Joshua Wise <jwise@google.com>
Signed-off-by: Tim Hockin <thockin@google.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Update defconfigs for ATNGW100 and ATSTK1002. This will enable the
SLUB allocator by default on both, and will enable NFS root on
ATSTK1002 (ATNGW100 had it enabled before.)
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>
The current at32ap7000 platform devices aren't declared as supporting DMA,
so that layered drivers can't tell whether they need to manage DMA.
This patch makes all those platform devices report that they support DMA.
Most do, but in a few cases this is inappropriate.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Haavard Skinnemoen <hskinnemoen@atmel.com>