block: Backport of various I/O topology fixes from 2.6.33 and 2.6.34
The stacking code incorrectly scaled up the data offset in some cases
causing misaligned devices to report alignment. Rewrite the stacking
algorithm to remedy this.
(Upstream commit 9504e0864b)
The top device misalignment flag would not be set if the added bottom
device was already misaligned as opposed to causing a stacking failure.
Also massage the reporting so that an error is only returned if adding
the bottom device caused the misalignment. I.e. don't return an error
if the top is already flagged as misaligned.
(Upstream commit fe0b393f2c)
lcm() was defined to take integer-sized arguments. The supplied
arguments are multiplied, however, causing us to overflow given
sufficiently large input. That in turn led to incorrect optimal I/O
size reporting in some cases. Switch lcm() over to unsigned long
similar to gcd() and move the function from blk-settings.c to lib.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit d2e7276b6b upstream.
This is retry of reverted 859ddf0974
("idr: fix a critical misallocation bug") which contained two bugs.
* pa[idp->layers] should be cleared even if it's not used by
sub_alloc() because it's used by mark idr_mark_full().
* The original condition check also assigned pa[l] to p which the new
code didn't do thus leaving p pointing at the wrong layer.
Both problems have been fixed and the idr code has received good amount
testing using userland testing setup where simple bitmap allocator is
run parallel to verify the result of idr allocation.
The bug this patch fixes is caused by sub_alloc() optimization path
bypassing out-of-room condition check and restarting allocation loop
with starting value higher than maximum allowed value. For detailed
description, please read commit message of 859ddf09.
Signed-off-by: Tejun Heo <tj@kernel.org>
Based-on-patch-from: Eric Paris <eparis@redhat.com>
Reported-by: Eric Paris <eparis@redhat.com>
Tested-by: Stefan Lippers-Hollmann <s.l-h@gmx.de>
Tested-by: Serge Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit a8fe9ea200 upstream.
Stephen Rothwell reported the following build warning:
lib/dma-debug.c: In function 'dma_debug_device_change':
lib/dma-debug.c:680: warning: 'return' with no value, in function returning non-void
Introduced by commit f797d9881b
("dma-debug: Do not add notifier when dma debugging is disabled").
Return 0 [notify-done] when disabled. (this is standard bus notifier behavior.)
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <20091231125624.GA14666@liondog.tnic>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
commit f797d9881b upstream.
If CONFIG_HAVE_DMA_API_DEBUG is defined and "dma_debug=off" is
specified on the kernel command line, when you detach a driver from a
device you can cause the following NULL pointer dereference:
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<c0580d35>] dma_debug_device_change+0x5d/0x117
The problem is that the dma_debug_device_change notifier function is
added to the bus notifier chain even though the dma_entry_hash array
was never initialized. If dma debugging is disabled, this patch both
prevents dma_debug_device_change notifiers from being added to the
chain, and additionally ensures that the dma_debug_device_change
notifier function is a no-op.
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Don't delete pending pages from the page-store tracking tree, but rather send
them for another write as they've presumably been updated.
Signed-off-by: David Howells <dhowells@redhat.com>
__fscache_write_page() attempts to load the radix tree preallocation pool for
the CPU it is on before calling radix_tree_insert(), as the insertion must be
done inside a pair of spinlocks.
Use of the preallocation pool, however, is contingent on the radix tree being
initialised without __GFP_WAIT specified. __fscache_acquire_cookie() was
passing GFP_NOFS to INIT_RADIX_TREE() - but that includes __GFP_WAIT.
The solution is to AND out __GFP_WAIT.
Additionally, the banner comment to radix_tree_preload() is altered to make
note of this prerequisite. Possibly there should be a WARN_ON() too.
Without this fix, I have seen the following recursive deadlock caused by
radix_tree_insert() attempting to allocate memory inside the spinlocked
region, which resulted in FS-Cache being called back into to release memory -
which required the spinlock already held.
=============================================
[ INFO: possible recursive locking detected ]
2.6.32-rc6-cachefs #24
---------------------------------------------
nfsiod/7916 is trying to acquire lock:
(&cookie->lock){+.+.-.}, at: [<ffffffffa0076872>] __fscache_uncache_page+0xdb/0x160 [fscache]
but task is already holding lock:
(&cookie->lock){+.+.-.}, at: [<ffffffffa0076acc>] __fscache_write_page+0x15c/0x3f3 [fscache]
other info that might help us debug this:
5 locks held by nfsiod/7916:
#0: (nfsiod){+.+.+.}, at: [<ffffffff81048290>] worker_thread+0x19a/0x2e2
#1: (&task->u.tk_work#2){+.+.+.}, at: [<ffffffff81048290>] worker_thread+0x19a/0x2e2
#2: (&cookie->lock){+.+.-.}, at: [<ffffffffa0076acc>] __fscache_write_page+0x15c/0x3f3 [fscache]
#3: (&object->lock#2){+.+.-.}, at: [<ffffffffa0076b07>] __fscache_write_page+0x197/0x3f3 [fscache]
#4: (&cookie->stores_lock){+.+...}, at: [<ffffffffa0076b0f>] __fscache_write_page+0x19f/0x3f3 [fscache]
stack backtrace:
Pid: 7916, comm: nfsiod Not tainted 2.6.32-rc6-cachefs #24
Call Trace:
[<ffffffff8105ac7f>] __lock_acquire+0x1649/0x16e3
[<ffffffff81059ded>] ? __lock_acquire+0x7b7/0x16e3
[<ffffffff8100e27d>] ? dump_trace+0x248/0x257
[<ffffffff8105ad70>] lock_acquire+0x57/0x6d
[<ffffffffa0076872>] ? __fscache_uncache_page+0xdb/0x160 [fscache]
[<ffffffff8135467c>] _spin_lock+0x2c/0x3b
[<ffffffffa0076872>] ? __fscache_uncache_page+0xdb/0x160 [fscache]
[<ffffffffa0076872>] __fscache_uncache_page+0xdb/0x160 [fscache]
[<ffffffffa0077eb7>] ? __fscache_check_page_write+0x0/0x71 [fscache]
[<ffffffffa00b4755>] nfs_fscache_release_page+0x86/0xc4 [nfs]
[<ffffffffa00907f0>] nfs_release_page+0x3c/0x41 [nfs]
[<ffffffff81087ffb>] try_to_release_page+0x32/0x3b
[<ffffffff81092c2b>] shrink_page_list+0x316/0x4ac
[<ffffffff81058a9b>] ? mark_held_locks+0x52/0x70
[<ffffffff8135451b>] ? _spin_unlock_irq+0x2b/0x31
[<ffffffff81093153>] shrink_inactive_list+0x392/0x67c
[<ffffffff81058a9b>] ? mark_held_locks+0x52/0x70
[<ffffffff810934ca>] shrink_list+0x8d/0x8f
[<ffffffff81093744>] shrink_zone+0x278/0x33c
[<ffffffff81052c70>] ? ktime_get_ts+0xad/0xba
[<ffffffff8109453b>] try_to_free_pages+0x22e/0x392
[<ffffffff8109184c>] ? isolate_pages_global+0x0/0x212
[<ffffffff8108e16b>] __alloc_pages_nodemask+0x3dc/0x5cf
[<ffffffff810ae24a>] cache_alloc_refill+0x34d/0x6c1
[<ffffffff811bcf74>] ? radix_tree_node_alloc+0x52/0x5c
[<ffffffff810ae929>] kmem_cache_alloc+0xb2/0x118
[<ffffffff811bcf74>] radix_tree_node_alloc+0x52/0x5c
[<ffffffff811bcfd5>] radix_tree_insert+0x57/0x19c
[<ffffffffa0076b53>] __fscache_write_page+0x1e3/0x3f3 [fscache]
[<ffffffffa00b4248>] __nfs_readpage_to_fscache+0x58/0x11e [nfs]
[<ffffffffa009bb77>] nfs_readpage_release+0x34/0x9b [nfs]
[<ffffffffa009c0d9>] nfs_readpage_release_full+0x32/0x4b [nfs]
[<ffffffffa0006cff>] rpc_release_calldata+0x12/0x14 [sunrpc]
[<ffffffffa0006e2d>] rpc_free_task+0x59/0x61 [sunrpc]
[<ffffffffa0006f03>] rpc_async_release+0x10/0x12 [sunrpc]
[<ffffffff810482e5>] worker_thread+0x1ef/0x2e2
[<ffffffff81048290>] ? worker_thread+0x19a/0x2e2
[<ffffffff81352433>] ? thread_return+0x3e/0x101
[<ffffffffa0006ef3>] ? rpc_async_release+0x0/0x12 [sunrpc]
[<ffffffff8104bff5>] ? autoremove_wake_function+0x0/0x34
[<ffffffff81058d25>] ? trace_hardirqs_on+0xd/0xf
[<ffffffff810480f6>] ? worker_thread+0x0/0x2e2
[<ffffffff8104bd21>] kthread+0x7a/0x82
[<ffffffff8100beda>] child_rip+0xa/0x20
[<ffffffff8100b87c>] ? restore_args+0x0/0x30
[<ffffffff8104c2b9>] ? add_wait_queue+0x15/0x44
[<ffffffff8104bca7>] ? kthread+0x0/0x82
[<ffffffff8100bed0>] ? child_rip+0x0/0x20
Signed-off-by: David Howells <dhowells@redhat.com>
Doing the strcmp return value as
signed char __res = *cs - *ct;
is wrong for two reasons. The subtraction can overflow because __res
doesn't use a type big enough. Moreover the compared bytes should be
interpreted as unsigned char as specified by POSIX.
The same problem is fixed in strncmp.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: Michael Buesch <mb@bu3sch.de>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, fs: Fix x86 procfs stack information for threads on 64-bit
x86: Add reboot quirk for 3 series Mac mini
x86: Fix printk message typo in mtrr cleanup code
dma-debug: Fix compile warning with PAE enabled
x86/amd-iommu: Un__init function required on shutdown
x86/amd-iommu: Workaround for erratum 63
When PAE is enabled in the kernel configuration the size of
phys_addr_t differs from the size of a void pointer. The gcc
prints a warning about that in dma-debug code.
This patch fixes the warning by converting the output to
unsigned long long instead of a pointer.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
We don't need an explicit PPC64 in the DEBUG_PREEMPT dependancies as all
PPC platforms now support TRACE_IRQFLAGS_SUPPORT.
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
After m68k's task_thread_info() doesn't refer to current,
it's possible to remove sched.h from interrupt.h and not break m68k!
Many thanks to Heiko Carstens for allowing this.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Also increase the maximum possible kmemleak early log entries since
2000 are not sufficient on s390.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
When using %*s, sscanf should honor conversion specifiers immediately
following the %*s. For example, the following code should find the
position of the end of the string "hello".
int end;
char buf[] = "hello world";
sscanf(buf, "%*s%n", &end);
printf("%d\n", end);
Ideally, sscanf would advance the fmt and str pointers the same as it
would without the *, but the code for that is rather complicated and is
not included in the patch.
Signed-off-by: Andy Spencer <andy753421@gmail.com>
Acked-by: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If the lzma/gzip decompressors are called with insufficient input data
(len > 0 & fill = NULL), they will attempt to call the fill function to
obtain more data, leading to a kernel oops.
Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-next: (30 commits)
Use macros for .data.page_aligned section.
Use macros for .bss.page_aligned section.
Use new __init_task_data macro in arch init_task.c files.
kbuild: Don't define ALIGN and ENTRY when preprocessing linker scripts.
arm, cris, mips, sparc, powerpc, um, xtensa: fix build with bash 4.0
kbuild: add static to prototypes
kbuild: fail build if recordmcount.pl fails
kbuild: set -fconserve-stack option for gcc 4.5
kbuild: echo the record_mcount command
gconfig: disable "typeahead find" search in treeviews
kbuild: fix cc1 options check to ensure we do not use -fPIC when compiling
checkincludes.pl: add option to remove duplicates in place
markup_oops: use modinfo to avoid confusion with underscored module names
checkincludes.pl: provide usage helper
checkincludes.pl: close file as soon as we're done with it
ctags: usability fix
kernel hacking: move STRIP_ASM_SYMS from General
gitignore usr/initramfs_data.cpio.bz2 and usr/initramfs_data.cpio.lzma
kbuild: Check if linker supports the -X option
kbuild: introduce ld-option
...
Fix trivial conflict in scripts/basic/fixdep.c
Jens Rosenboom noticed that a possibly unaligned const char*
is cast to a const struct in6_addr *.
Avoid this at the cost of a struct in6_addr copy on the stack.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (34 commits)
trivial: fix typo in aic7xxx comment
trivial: fix comment typo in drivers/ata/pata_hpt37x.c
trivial: typo in kernel-parameters.txt
trivial: fix typo in tracing documentation
trivial: add __init/__exit macros in drivers/gpio/bt8xxgpio.c
trivial: add __init macro/ fix of __exit macro location in ipmi_poweroff.c
trivial: remove unnecessary semicolons
trivial: Fix duplicated word "options" in comment
trivial: kbuild: remove extraneous blank line after declaration of usage()
trivial: improve help text for mm debug config options
trivial: doc: hpfall: accept disk device to unload as argument
trivial: doc: hpfall: reduce risk that hpfall can do harm
trivial: SubmittingPatches: Fix reference to renumbered step
trivial: fix typos "man[ae]g?ment" -> "management"
trivial: media/video/cx88: add __init/__exit macros to cx88 drivers
trivial: fix typo in CONFIG_DEBUG_FS in gcov doc
trivial: fix missing printk space in amd_k7_smp_check
trivial: fix typo s/ketymap/keymap/ in comment
trivial: fix typo "to to" in multiple files
trivial: fix typos in comments s/DGBU/DBGU/
...
FLEX_ARRAY_INIT(element_size, total_nr_elements) cannot determine if
either parameter is valid, so flex arrays which are statically allocated
with this interface can easily become corrupted or reference beyond its
allocated memory.
This removes FLEX_ARRAY_INIT() as a struct flex_array initializer since no
initializer may perform the required checking. Instead, the array is now
defined with a new interface:
DEFINE_FLEX_ARRAY(name, element_size, total_nr_elements)
This may be prefixed with `static' for file scope.
This interface includes compile-time checking of the parameters to ensure
they are valid. Since the validity of both element_size and
total_nr_elements depend on FLEX_ARRAY_BASE_SIZE and FLEX_ARRAY_PART_SIZE,
the kernel build will fail if either of these predefined values changes
such that the array parameters are no longer valid.
Since BUILD_BUG_ON() requires compile time constants, several of the
static inline functions that were once local to lib/flex_array.c had to be
moved to include/linux/flex_array.h.
Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>