It is already known that buggy firmwares exist which report a bogus
link_spd in their config ROM bus info block. We have now received the first
report of a bogus max_rom as well (Freecom FireWire Hard Drive 1TB,
http://bugzilla.kernel.org/show_bug.cgi?id=12206).
I suspect other OSs only use quadlet reads to fetch the config ROM,
otherwise the firmware authors would have noticed their mistake.
Hence limit ieee1394's config ROM fetching routine to quadlets as the
safe minimum regardless of what the bus info block says.
This may slow down the bus reset handling by nodemgr somewhat.
But most existing devices support only quadlet reads anyway,
so there will often be no actual difference compared to before this change.
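The idea of always falling back to quadlet reads can be sketched in plain
userspace C as follows. This is only an illustration and not the ieee1394
code: quadlet_read_fn, struct fake_rom, fake_rom_read() and
fetch_rom_quadlets() are hypothetical names, and the advertised block size is
passed in merely to show that it is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QUADLET_BYTES 4u

/* Hypothetical transport hook: read one 32-bit quadlet at a byte offset. */
typedef int (*quadlet_read_fn)(void *ctx, uint32_t offset, uint32_t *out);

/* Stand-in device for the demo: a ROM image plus a read counter. */
struct fake_rom {
	const uint32_t *image;
	size_t nquads;
	unsigned int reads;
};

static int fake_rom_read(void *ctx, uint32_t offset, uint32_t *out)
{
	struct fake_rom *rom = ctx;
	size_t idx = offset / QUADLET_BYTES;

	if (offset % QUADLET_BYTES || idx >= rom->nquads)
		return -1;
	*out = rom->image[idx];
	rom->reads++;
	return 0;
}

/*
 * Fetch nquads quadlets one at a time. advertised_max_bytes models the
 * block size claimed by the device; it is deliberately ignored because
 * it may be bogus.
 */
static int fetch_rom_quadlets(quadlet_read_fn read_q, void *ctx,
			      uint32_t *buf, size_t nquads,
			      size_t advertised_max_bytes)
{
	size_t i;

	(void)advertised_max_bytes;
	assert(read_q != NULL && buf != NULL);

	for (i = 0; i < nquads; i++) {
		int err = read_q(ctx, (uint32_t)(i * QUADLET_BYTES), &buf[i]);

		if (err)
			return err;
	}
	return 0;
}
```

The cost is one request per quadlet, which is the slowdown mentioned above
for devices that could in fact handle block reads.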
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
On my HP 2510p I get the following in dmesg near the end of most
resumes from suspend to RAM:
irq 19: nobody cared (try booting with the "irqpoll" option)
Pid: 0, comm: swapper Not tainted 2.6.28-rc7 #67
Call Trace:
<IRQ> [<ffffffffa00ee9e1>] ? ohci_irq_handler+0x60/0x7e9 [ohci1394]
[<ffffffff8026aa4d>] __report_bad_irq+0x38/0x87
[<ffffffff8026abaa>] note_interrupt+0x10e/0x174
[<ffffffff8026b262>] handle_fasteoi_irq+0xa7/0xd1
[<ffffffff8020eb87>] do_IRQ+0x73/0xe4
[<ffffffff8020c626>] ret_from_intr+0x0/0xa
<EOI> [<ffffffffa0012606>] ? acpi_idle_enter_bm+0x26b/0x2b2 [processor]
[<ffffffffa00125fc>] ? acpi_idle_enter_bm+0x261/0x2b2 [processor]
[<ffffffff8024f30f>] ? notifier_call_chain+0x33/0x5b
[<ffffffff803b9c64>] ? cpuidle_idle_call+0x8c/0xc4
[<ffffffff8020b312>] ? cpu_idle+0x4a/0x9a
[<ffffffff8042c5c8>] ? rest_init+0x5c/0x5e
handlers:
[<ffffffffa00ee981>] (ohci_irq_handler+0x0/0x7e9 [ohci1394])
Disabling IRQ #19
There also seems to be an interrupt storm during suspend/resume when this
happens:
19: 99968 33 IO-APIC-fasteoi ohci1394
This patch gets rid of both issues and makes the resume as a whole
significantly faster.
Signed-off-by: Frans Pop <elendil@planet.nl>
As was pointed out in http://lkml.org/lkml/2008/12/6/127, this does not
fix the cause of the interrupt storm. However, since the source of the
interrupts could not be determined yet, we make the system at least more
usable with this change.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
There's a new arch-level ptrace feature in .28:
config X86_PTRACE_BTS
bool "Branch Trace Store"
It has broken fork() handling: the old DS area gets copied over into
a new task without being cleared.
Fixes exist but they came too late:
c5dee61: x86, bts: memory accounting
bf53de9: x86, bts: add fork and exit handling
and are queued up for v2.6.29. This shows that the facility is still not
tested well enough to release into a stable kernel - disable it for now and
reactivate in .29. In .29 the hardware-branch-tracer will use the DS/BTS
facilities too - hopefully resulting in better code.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
flush_tlb_mm's "optimized" uniprocessor case of allocating a new
context for userspace is exposing a race where we can suddenly return
to a syscall with the protection id and space id out of sync, trapping
on the next userspace access.
Debugged-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Tested-by: Helge Deller <deller@gmx.de>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When deleting an edac device, we have to wait for its edac_dev.work to
complete before deleting the whole edac_dev structure. Since we have no
idea which work item in the edac_poller's workqueue is the one we are
concerned about, we wait for all work in the edac_poller's workqueue to be
processed. This is done via flush_cpu_workqueue(), which inserts a
wq_barrier at the tail of the workqueue and then sleeps on the
completion of this wq_barrier. The edac_poller wakes up the sleepers when
it reaches the wq_barrier.
The EDAC core creates only one kernel worker thread, edac_poller, to run the
work items of all current edac devices. They share the same callback function,
edac_device_workq_function(), which grabs device_ctls_mutex
before it checks the device. This is exactly
where edac_poller and rmmod have a great chance to deadlock.
In the call chain rmmod > ... >
edac_device_del_device >
edac_device_workq_teardown > flush_workqueue > flush_cpu_workqueue,
device_ctls_mutex has already been grabbed by
edac_device_del_device(). So, on one hand rmmod sleeps on the
completion of a wq_barrier while holding device_ctls_mutex; on the other hand
edac_poller blocks on the same mutex when it runs any of the
work items of the existing edac devices (note that this edac_dev.work is likely
to be totally unrelated to the one that is being removed right now) and never
gets a chance to run the wq_barrier work that would wake rmmod up.
edac_device_workq_teardown() should not be called within the critical
region of device_ctls_mutex. This is how it is already done in
edac_pci_del_device() and edac_mc_del_mc(), where edac_pci_workq_teardown() and
edac_mc_workq_teardown() are called after the related mutexes are released.
Moreover, an edac_dev.work should first check whether its device is being
removed. If this is the case, it should bail out immediately. Since not all
existing edac devices are being removed, this "shutting down" flag should be
confined to the edac device being removed. The existing edac_dev.op_state can
be used to serve this purpose.
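The two rules above (mark the device offline under the mutex, then flush only
after the mutex has been released) can be modelled in a small single-threaded
userspace sketch. Everything named demo_* is hypothetical and only mirrors the
roles of device_ctls_mutex, edac_dev.op_state and the shared work callback; a
boolean stands in for the mutex so that a would-be deadlock shows up as a
failed assertion instead of a hang.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_op_state { DEMO_RUNNING_POLL, DEMO_OFFLINE };

struct demo_edac_dev {
	enum demo_op_state op_state;
	unsigned int polls;
};

struct demo_ctl {
	bool mutex_held;	/* stands in for device_ctls_mutex */
};

/* Shared work callback. Returns true if the work would be re-armed. */
static bool demo_workq_function(struct demo_ctl *ctl,
				struct demo_edac_dev *dev)
{
	bool rearm = false;

	assert(!ctl->mutex_held);	/* a real poller would block here */
	ctl->mutex_held = true;
	if (dev->op_state == DEMO_RUNNING_POLL) {
		dev->polls++;
		rearm = true;
	}
	/* a device marked offline is skipped and not re-armed */
	ctl->mutex_held = false;
	return rearm;
}

/* Removal path: flag the device under the mutex, flush after unlocking. */
static void demo_del_device(struct demo_ctl *ctl,
			    struct demo_edac_dev *victim,
			    struct demo_edac_dev *others, size_t n_others)
{
	size_t i;

	ctl->mutex_held = true;
	victim->op_state = DEMO_OFFLINE;
	ctl->mutex_held = false;

	/* "flush": all queued work runs; the mutex is no longer held */
	(void)demo_workq_function(ctl, victim);
	for (i = 0; i < n_others; i++)
		(void)demo_workq_function(ctl, &others[i]);
}
```

Moving the flush loop above the line that clears mutex_held would trip the
assertion in demo_workq_function(), which corresponds to the deadlock
described here.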
The original deadlock problem and the solution have been witnessed and
tested on actual hardware. Without the solution, removing an edac driver
with rmmod results in the deadlock below:
root@localhost:/root> rmmod mv64x60_edac
EDAC DEBUG: mv64x60_dma_err_remove()
EDAC DEBUG: edac_device_del_device()
EDAC DEBUG: find_edac_device_by_dev()
(hang for a moment)
INFO: task edac-poller:2030 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
edac-poller D 00000000 0 2030 2
Call Trace:
[df159dc0] [c0071e3c] free_hot_cold_page+0x17c/0x304 (unreliable)
[df159e80] [c000a024] __switch_to+0x6c/0xa0
[df159ea0] [c03587d8] schedule+0x2f4/0x4d8
[df159f00] [c03598a8] __mutex_lock_slowpath+0xa0/0x174
[df159f40] [e1030434] edac_device_workq_function+0x28/0xd8 [edac_core]
[df159f60] [c003beb4] run_workqueue+0x114/0x218
[df159f90] [c003c674] worker_thread+0x5c/0xc8
[df159fd0] [c004106c] kthread+0x5c/0xa0
[df159ff0] [c0013538] original_kernel_thread+0x44/0x60
INFO: task rmmod:2062 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
rmmod D 0ff2c9fc 0 2062 1839
Call Trace:
[df119c00] [c0437a74] 0xc0437a74 (unreliable)
[df119cc0] [c000a024] __switch_to+0x6c/0xa0
[df119ce0] [c03587d8] schedule+0x2f4/0x4d8
[df119d40] [c03591dc] schedule_timeout+0xb0/0xf4
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If cgroup_get_rootdir() fails, free_cg_links() is called in the
failure path, but tmp_cg_links hasn't been initialized at that point.
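For illustration, one general way to keep such a cleanup path safe is to
initialize the list head before the first operation that can fail. This is a
userspace sketch under that assumption and not necessarily how this patch
fixes the problem; struct list_node, struct demo_link, demo_add_link(),
demo_free_links() and demo_setup() are hypothetical names.

```c
#include <stddef.h>
#include <stdlib.h>

struct list_node {
	struct list_node *next, *prev;
};

struct demo_link {
	struct list_node node;	/* must stay the first member */
	int id;
};

static void list_init(struct list_node *head)
{
	head->next = head->prev = head;
}

static int demo_add_link(struct list_node *head, int id)
{
	struct demo_link *l = malloc(sizeof(*l));

	if (!l)
		return -1;
	l->id = id;
	l->node.next = head->next;
	l->node.prev = head;
	head->next->prev = &l->node;
	head->next = &l->node;
	return 0;
}

/* Free every entry; returns how many were freed. Needs a valid head. */
static int demo_free_links(struct list_node *head)
{
	struct list_node *n = head->next;
	int freed = 0;

	while (n != head) {
		struct list_node *next = n->next;

		free((struct demo_link *)n);	/* node is the first member */
		freed++;
		n = next;
	}
	list_init(head);
	return freed;
}

/* fail_early models an early step failing before any link is allocated. */
static int demo_setup(int fail_early)
{
	struct list_node tmp_links;

	list_init(&tmp_links);	/* before anything can jump to "out" */

	if (fail_early)
		goto out;
	if (demo_add_link(&tmp_links, 1) || demo_add_link(&tmp_links, 2))
		goto out;
out:
	return demo_free_links(&tmp_links);
}
```

Without the list_init() call, the early-failure path would walk whatever
garbage happens to be in tmp_links.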
I introduced this bug in the 2.6.27 merge window.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
While testing the w1-gpio driver I found that in "w1.c:679
w1_slave_found()" the device id is converted to little-endian with
"cpu_to_le64()", but it is not converted back to CPU format in "w1_io.c:293
w1_reset_select_slave()".
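The round trip that has to stay balanced can be shown with portable userspace
stand-ins for the kernel helpers. demo_cpu_to_le64() and demo_le64_to_cpu()
are hypothetical names for this sketch; they behave the same on little-endian
and big-endian hosts.

```c
#include <stdint.h>
#include <string.h>

/* Return a value whose in-memory byte order is little-endian. */
static uint64_t demo_cpu_to_le64(uint64_t v)
{
	uint8_t b[8];
	uint64_t le;
	int i;

	for (i = 0; i < 8; i++)
		b[i] = (uint8_t)(v >> (8 * i));
	memcpy(&le, b, sizeof(le));
	return le;
}

/* Inverse: interpret the in-memory bytes as little-endian. */
static uint64_t demo_le64_to_cpu(uint64_t le)
{
	uint8_t b[8];
	uint64_t v = 0;
	int i;

	memcpy(b, &le, sizeof(b));
	for (i = 0; i < 8; i++)
		v |= (uint64_t)b[i] << (8 * i);
	return v;
}
```

On a little-endian CPU both helpers are no-ops, which is why a missing
conversion only shows up on big-endian machines.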
Based on a patch created by Andreas Hummel.
[akpm@linux-foundation.org: remove unneeded cast]
Reported-by: Andreas Hummel <andi_hummel@gmx.de>
Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch for the rtc-isl1208 driver makes it reject invalid dates.
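As a rough illustration of what rejecting invalid dates can look like, here
is a generic calendar check on a struct tm in userspace C. It is a sketch
only: demo_tm_is_valid() and its helpers are hypothetical names, and the
actual check added to the driver may be different or narrower.

```c
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

static bool demo_is_leap_year(int year)
{
	return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

/* mon is 0..11, as in struct tm. */
static int demo_days_in_month(int year, int mon)
{
	static const int days[12] = {
		31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31
	};

	if (mon == 1 && demo_is_leap_year(year))
		return 29;
	return days[mon];
}

/* True if the date and time fields describe a real calendar instant. */
static bool demo_tm_is_valid(const struct tm *tm)
{
	int year;

	if (tm == NULL)
		return false;
	year = tm->tm_year + 1900;	/* tm_year is an offset from 1900 */
	if (tm->tm_mon < 0 || tm->tm_mon > 11)
		return false;
	if (tm->tm_mday < 1 ||
	    tm->tm_mday > demo_days_in_month(year, tm->tm_mon))
		return false;
	if (tm->tm_hour < 0 || tm->tm_hour > 23)
		return false;
	if (tm->tm_min < 0 || tm->tm_min > 59)
		return false;
	if (tm->tm_sec < 0 || tm->tm_sec > 59)
		return false;
	return true;
}
```

A set-time path would call such a check first and return an error instead of
programming the hardware with a date like February 30.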
Signed-off-by: Chris Elston <celston@katalix.com>
[a.zummo@towertech.it: added comment explaining the check]
Signed-off-by: Alessandro Zummo <a.zummo@towertech.it>
Cc: Hebert Valerio Riedel <hvr@gnu.org>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix a NULL pointer dereference that would occur if the video decoder tied to
the em28xx supports the VIDIOC_INT_RESET call (for example, the cx25840 driver).
Signed-off-by: Devin Heitmueller <dheitmueller@linuxtv.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
In each case, if the NULL test is necessary, then the dereference should be
moved below the NULL test.
The semantic patch that makes this change is as follows:
(http://www.emn.fr/x-info/coccinelle/)
// <smpl>
@@
type T;
expression E;
identifier i,fld;
statement S;
@@
- T i = E->fld;
+ T i;
... when != E
when != i
if (E == NULL) S
+ i = E->fld;
// </smpl>
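In plain C, the transformation expressed by the semantic patch looks like the
following. The struct and function names are made up for this example and do
not come from any of the touched drivers.

```c
#include <stddef.h>

struct demo_drive {
	int unit;
};

/*
 * Before the change the function started with
 *	int unit = drive->unit;
 * which dereferences drive ahead of the NULL test below.
 */
static int demo_unit_or_default(const struct demo_drive *drive, int fallback)
{
	int unit;

	if (drive == NULL)
		return fallback;

	unit = drive->unit;	/* dereference only after the NULL test */
	return unit;
}
```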
Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>