linux

mirror of https://github.com/armbian/linux.git synced 2026-01-06 10:13:00 -08:00

Author	SHA1	Message	Date
KAMEZAWA Hiroyuki	e00e431612	memcg: fix wrong pointer initialization at page migration when memcg is disabled. Lee Schermerhorn reported that he saw bad pointer dereference in mem_cgroup_end_migration() when he disabled memcg by boot option. memcg's page migration logic works as mem_cgroup_prepare_migration(page, &ptr); do page migration mem_cgroup_end_migration(page, ptr); Now, ptr is not initialized in prepare_migration when memcg is disabled by boot option. This causes panic in end_migration. This patch fixes it. Reported-by: Lee Schermerhorn <Lee.Schermerhorn@hp.com> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-11-12 07:25:56 -08:00
Mel Gorman	9d0ed60fe9	page allocator: Do not allow interrupts to use ALLOC_HARDER Commit `341ce06f69` ("page allocator: calculate the alloc_flags for allocation only once") altered watermark logic slightly by allowing rt_tasks that are handling an interrupt to set ALLOC_HARDER. This patch brings the watermark logic more in line with 2.6.30. This change results in a reduction of the number high-order GFP_ATOMIC allocation failures reported. See http://www.gossamer-threads.com/lists/linux/kernel/1144153 [rientjes@google.com: Spotted the problem] Signed-off-by: Mel Gorman <mel@csn.ul.ie> Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi> Reviewed-by: Rik van Riel <riel@redhat.com> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: David Rientjes <rientjes@google.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-11-12 07:25:56 -08:00
Mel Gorman	cc4a685146	page allocator: always wake kswapd when restarting an allocation attempt after direct reclaim failed If a direct reclaim makes no forward progress, it considers whether it should go OOM or not. Whether OOM is triggered or not, it may retry the allocation afterwards. In times past, this would always wake kswapd as well but currently, kswapd is not woken up after direct reclaim fails. For order-0 allocations, this makes little difference but if there is a heavy mix of higher-order allocations that direct reclaim is failing for, it might mean that kswapd is not rewoken for higher orders as much as it did previously. This patch wakes up kswapd when an allocation is being retried after a direct reclaim failure. It would be expected that kswapd is already awake, but this has the effect of telling kswapd to reclaim at the higher order as well. Signed-off-by: Mel Gorman <mel@csn.ul.ie> Reviewed-by: Christoph Lameter <cl@linux-foundation.org> Reviewed-by: Pekka Enberg <penberg@cs.helsinki.fi> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-11-12 07:25:56 -08:00
Linus Torvalds	961767b75d	Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: highmem: Fix debug_kmap_atomic() to also handle KM_IRQ_PTE, KM_NMI, and KM_NMI_PTE highmem: Fix race in debug_kmap_atomic() which could cause warn_count to underflow rcu: Fix long-grace-period race between forcing and initialization uids: Prevent tear down race	2009-11-11 11:30:15 -08:00
Soeren Sandmann	d451564669	highmem: Fix debug_kmap_atomic() to also handle KM_IRQ_PTE, KM_NMI, and KM_NMI_PTE Previously calling debug_kmap_atomic() with these types would cause spurious warnings. (triggered by SysProf using perf events) Signed-off-by: Soeren Sandmann Pedersen <sandmann@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: a.p.zijlstra@chello.nl Cc: <stable@kernel.org> # .31.x LKML-Reference: <ye8vdhz8krw.fsf@camel23.daimi.au.dk> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-10 04:15:47 +01:00
Soeren Sandmann	5ebd4c2289	highmem: Fix race in debug_kmap_atomic() which could cause warn_count to underflow debug_kmap_atomic() tries to prevent ever printing more than 10 warnings, but it does so by testing whether an unsigned integer is equal to 0. However, if the warning is caused by a nested IRQ, then this counter may underflow and the stream of warnings will never end. Fix that by using a signed integer instead. Signed-off-by: Soeren Sandmann Pedersen <sandmann@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: a.p.zijlstra@chello.nl Cc: <stable@kernel.org> # .31.x LKML-Reference: <ye8zl7b8ktj.fsf@camel23.daimi.au.dk> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-11-10 04:15:32 +01:00
Hugh Dickins	d178f27fc5	ksm: cond_resched in unstable tree KSM needs a cond_resched() for CONFIG_PREEMPT_NONE, in its unbounded search of the unstable tree. The stable tree cases already have one, and originally there was one down inside get_user_pages(); but I missed it when I converted to follow_page() instead. Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk> Acked-by: Izik Eidus <ieidus@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-11-09 09:55:44 -08:00
Linus Torvalds	51bb296b09	Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block * 'for-linus' of git://git.kernel.dk/linux-2.6-block: cfq-iosched: limit coop preemption cfq-iosched: fix bad return value cfq_should_preempt() backing-dev: bdi sb prune should be in the unregister path, not destroy Fix bio_alloc() and bio_kmalloc() documentation bio_put(): add bio_clone() to the list of functions in the comment	2009-11-03 18:16:21 -08:00
Jens Axboe	8c4db3355b	backing-dev: bdi sb prune should be in the unregister path, not destroy Commit `592b09a42f` was different from the tested path, in that it moved the bdi super_block prune from unregister to destroy context. This doesn't fully fix the sync hang bug on unexpected device removal, as need to prune the bdi cache pointer before killing flusher thread. Tested-by: Artur Skawina <art.08.09@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-11-03 20:18:44 +01:00
Bo Liu	32c5fc10e7	mm: remove incorrect swap_count() from try_to_unuse() In try_to_unuse(), swcount is a local copy of swap_map, including the SWAP_HAS_CACHE bit; but a wrong comparison against swap_count(swap_map), which masks off the SWAP_HAS_CACHE bit, succeeded where it should fail. That had the effect of resetting the mm from which to start searching for the next swap page, to an irrelevant mm instead of to an mm in which this swap page had been found: which may increase search time by ~20%. But we're used to swapoff being slow, so never noticed the slowdown. Remove that one spurious use of swap_count(): Bo Liu thought it merely redundant, Hugh rewrote the description since it was measurably wrong. Signed-off-by: Bo Liu <bo-liu@hotmail.com> Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-11-02 09:44:41 -08:00
David Howells	89a8640279	NOMMU: Don't pass NULL pointers to fput() in do_mmap_pgoff() Don't pass NULL pointers to fput() in the error handling paths of the NOMMU do_mmap_pgoff() as it can't handle it. The following can be used as a test program: int main() { static long long a[1024 * 1024 * 20] = { 0 }; return a;} Without the patch, the code oopses in atomic_long_dec_and_test() as called by fput() after the kernel complains that it can't allocate that big a chunk of memory. With the patch, the kernel just complains about the allocation size and then the program segfaults during execve() as execve() can't complete the allocation of all the new ELF program segments. Reported-by: Robin Getz <rgetz@blackfin.uclinux.org> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Robin Getz <rgetz@blackfin.uclinux.org> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-10-31 12:11:37 -07:00
Linus Torvalds	8633322c5f	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: sched: move rq_weight data array out of .percpu percpu: allow pcpu_alloc() to be called with IRQs off	2009-10-29 09:19:29 -07:00
Linus Torvalds	68e71d1902	Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block * 'for-linus' of git://git.kernel.dk/linux-2.6-block: backing-dev: ensure that a removed bdi no longer has super_block referencing it block: use after free bug in __blkdev_get block: silently error unsupported empty barriers too	2009-10-29 09:17:19 -07:00
Linus Torvalds	0a53f1693c	Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc/ppc64: Use preempt_schedule_irq instead of preempt_schedule powerpc: Minor cleanup to lib/Kconfig.debug powerpc: Minor cleanup to sound/ppc/Kconfig powerpc: Minor cleanup to init/Kconfig powerpc: Limit memory hotplug support to PPC64 Book-3S machines powerpc: Limit hugetlbfs support to PPC64 Book-3S machines powerpc: Fix compile errors found by new ppc64e_defconfig powerpc: Add a Book-3E 64-bit defconfig powerpc/booke: Fix xmon single step on PowerPC Book-E powerpc: Align vDSO base address powerpc: Fix segment mapping in vdso32 powerpc/iseries: Remove compiler version dependent hack powerpc/perf_events: Fix priority of MSR HV vs PR bits powerpc/5200: Update defconfigs drivers/serial/mpc52xx_uart.c: Use UPIO_MEM rather than SERIAL_IO_MEM powerpc/boot/dts: drop obsolete 'fsl5200-clocking' of: Remove nested function mpc5200: support for the MAN mpc5200 based board mucmc52 mpc5200: support for the MAN mpc5200 based board uc101	2009-10-29 08:59:06 -07:00
Linus Torvalds	3242f9804b	Merge branch 'hwpoison-2.6.32' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6 * 'hwpoison-2.6.32' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6: HWPOISON: fix invalid page count in printk output HWPOISON: Allow schedule_on_each_cpu() from keventd HWPOISON: fix/proc/meminfo alignment HWPOISON: fix oops on ksm pages HWPOISON: Fix page count leak in hwpoison late kill in do_swap_page HWPOISON: return early on non-LRU pages HWPOISON: Add brief hwpoison description to Documentation HWPOISON: Clean up PR_MCE_KILL interface	2009-10-29 08:20:00 -07:00
Daisuke Nishimura	c36987e2ef	mm: don't call pte_unmap() against an improper pte There are some places where we do like: pte = pte_map(); do { (do break in some conditions) } while (pte++, ...); pte_unmap(pte - 1); But if the loop breaks at the first loop, pte_unmap() unmaps invalid pte. This patch is a fix for this problem. Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Reviewd-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Hugh Dickins <hugh.dickins@tiscali.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-10-29 07:39:32 -07:00
Russell King	1a83e175dc	mm: fix sparsemem configuration Currently, sparsemem is only available if EXPERIMENTAL is enabled. However, it hasn't ever been marked experimental. It's been about four years since sparsemem was merged, and we have platforms which depend on it; allow architectures to decide whether sparsemem should be the default memory model. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-10-29 07:39:31 -07:00
Johannes Weiner	6a7b95481d	vmscan: order evictable rescue in LRU putback Isolators putting a page back to the LRU do not hold the page lock, and if the page is mlocked, another thread might munlock it concurrently. Expecting this, the putback code re-checks the evictability of a page when it just moved it to the unevictable list in order to correct its decision. The problem, however, is that ordering is not garuanteed between setting PG_lru when moving the page to the list and checking PG_mlocked afterwards: #0: #1 spin_lock() if (TestClearPageMlocked()) if (PageLRU()) move to evictable list SetPageLRU() spin_unlock() if (!PageMlocked()) move to evictable list The PageMlocked() check may get reordered before SetPageLRU() in #0, resulting in #0 not moving the still mlocked page, and in #1 failing to isolate and move the page as well. The page is now stranded on the unevictable list. The race condition is very unlikely. The consequence currently is one page falling off the reclaim grid and eventually getting freed with PG_unevictable set, which triggers a warning in the page allocator. TestClearPageMlocked() in #1 already provides full memory barrier semantics. This patch adds an explicit full barrier to force ordering between SetPageLRU() and PageMlocked() so that either one of the competitors rescues the page. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-10-29 07:39:30 -07:00
KOSAKI Motohiro	b05ca7385a	do_mbind(): fix memory leak If migrate_prep is failed, new variable is leaked. This patch fixes it. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: Christoph Lameter <cl@linux-foundation.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-10-29 07:39:29 -07:00
KOSAKI Motohiro	ab8a3e14e6	mbind(): fix leak of never putback pages If mbind() receives an invalid address, do_mbind leaks a page. The following test program detects this leak. This patch fixes it. migrate_efault.c ======================================= #include <numaif.h> #include <numa.h> #include <sys/mman.h> #include <stdio.h> #include <unistd.h> #include <stdlib.h> #include <string.h> static unsigned long pagesize; static void* make_hole_mapping(void) { void* addr; addr = mmap(NULL, pagesize3, PROT_READ\|PROT_WRITE, MAP_ANON\|MAP_PRIVATE, 0, 0); if (addr == MAP_FAILED) return NULL; / make page populate / memset(addr, 0, pagesize3); /* make memory hole / munmap(addr+pagesize, pagesize); return addr; } int main(int argc, char* argv) { void* addr; int ch; int node; struct bitmask nmask = numa_allocate_nodemask(); int err; int node_set = 0; while ((ch = getopt(argc, argv, "n:")) != -1){ switch (ch){ case 'n': node = strtol(optarg, NULL, 0); numa_bitmask_setbit(nmask, node); node_set = 1; break; default: ; } } argc -= optind; argv += optind; if (!node_set) numa_bitmask_setbit(nmask, 0); pagesize = getpagesize(); addr = make_hole_mapping(); err = mbind(addr, pagesize3, MPOL_BIND, nmask->maskp, nmask->size, MPOL_MF_MOVE_ALL); if (err) perror("mbind "); return 0; } ======================================= Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: Christoph Lameter <cl@linux-foundation.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-10-29 07:39:29 -07:00
Wu Fengguang	41e20983fe	vmscan: limit VM_EXEC protection to file pages It is possible to have !Anon but SwapBacked pages, and some apps could create huge number of such pages with MAP_SHARED\|MAP_ANONYMOUS. These pages go into the ANON lru list, and hence shall not be protected: we only care mapped executable files. Failing to do so may trigger OOM. Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Rik van Riel <riel@redhat.com> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-10-29 07:39:27 -07:00
Andrew Morton	b76146ed1a	revert "mm: oom analysis: add buffer cache information to show_free_areas()" Revert commit `71de1ccbe1` Author: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> AuthorDate: Mon Sep 21 17:01:31 2009 -0700 Commit: Linus Torvalds <torvalds@linux-foundation.org> CommitDate: Tue Sep 22 07:17:27 2009 -0700 mm: oom analysis: add buffer cache information to show_free_areas() show_free_areas() is called during page allocation failures, and page allocation failures can occur in any calling context. But nr_blockdev_pages() takes VFS locks which should not be taken from hard IRQ context (at least). The result is lockdep warnings (and deadlockability) during page allocation failures. Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Cc: Rik van Riel <riel@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-10-29 07:39:27 -07:00
KOSAKI Motohiro	58355c7876	congestion_wait(): don't use WRITE commit `8aa7e847d` (Fix congestion_wait() sync/async vs read/write confusion) replace WRITE with BLK_RW_ASYNC. Unfortunately, concurrent mm development made the unchanged place accidentally. This patch fixes it too. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by: Jens Axboe <jens.axboe@oracle.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-10-29 07:39:25 -07:00
Hugh Dickins	92f7ba70ee	hwpoison: fix oops on ksm pages Memory failure on a KSM page currently oopses on its NULL anon_vma in page_lock_anon_vma(): that may not be much worse than the consequence of ignoring it, but it is better to be consistent with how ZERO_PAGE and hugetlb pages and other awkward cases are treated. Just skip it. We could fix it for 2.6.32 at the KSM end, by putting a dummy anon_vma pointer in there; but that would get harder next time, when KSM will put a pointer to something else there (and I'm not currently planning to do any work to open that up to memory_failure). So I would prefer this simple PageKsm test, until the other exceptions are handled. Signed-off-by: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-10-29 07:39:24 -07:00
Jens Axboe	592b09a42f	backing-dev: ensure that a removed bdi no longer has super_block referencing it When the bdi is being removed, we have to ensure that no super_blocks currently have that cached in sb->s_bdi. Normally this is ensured by the sb having a longer life span than the bdi, but if the device is suddenly yanked, we have to kill this reference. sb->s_bdi is pointed to freed memory at that point. This fixes a problem with sync(1) hanging when a USB stick is pulled without cleanly umounting it first. Reported-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-10-29 11:46:12 +01:00

1 2 3 4 5 ...

3598 Commits