Commit Graph

81 Commits

Author SHA1 Message Date
Johannes Weiner
799f933a82 mm: bootmem: try harder to free pages in bulk
The loop that frees pages to the page allocator while bootstrapping tries
to free higher-order blocks only when the starting address is aligned to
that block size.  Otherwise it will free all pages on that node
one-by-one.

Change it to free individual pages up to the first aligned block and then
try higher-order frees from there.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-10 16:30:45 -08:00
Johannes Weiner
560a036b3a mm: bootmem: drop superfluous range check when freeing pages in bulk
The area node_bootmem_map represents is aligned to BITS_PER_LONG, and all
bits in any aligned word of that map valid.  When the represented area
extends beyond the end of the node, the non-existant pages will be marked
as reserved.

As a result, when freeing a page block, doing an explicit range check for
whether that block is within the node's range is redundant as the bitmap
is consulted anyway to see whether all pages in the block are unreserved.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-10 16:30:44 -08:00
Uwe Kleine-König
9571a98290 bootmem: micro optimize freeing pages in bulk
The first entry of bdata->node_bootmem_map holds the data for
bdata->node_min_pfn up to bdata->node_min_pfn + BITS_PER_LONG - 1.  So the
test for freeing all pages of a single map entry can be slightly relaxed.

Moreover use DIV_ROUND_UP in another place instead of open coding it.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: Johannes Weiner <hannes@saeurebad.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-01-10 16:30:44 -08:00
Paul Gortmaker
b95f1b31b7 mm: Map most files to use export.h instead of module.h
The files changed within are only using the EXPORT_SYMBOL
macro variants.  They are not using core modular infrastructure
and hence don't need module.h but only the export.h header.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 09:20:12 -04:00
Olaf Hering
93a72052be crash_dump: export is_kdump_kernel to modules, consolidate elfcorehdr_addr, setup_elfcorehdr and saved_max_pfn
The Xen PV drivers in a crashed HVM guest can not connect to the dom0
backend drivers because both frontend and backend drivers are still in
connected state.  To run the connection reset function only in case of a
crashdump, the is_kdump_kernel() function needs to be available for the PV
driver modules.

Consolidate elfcorehdr_addr, setup_elfcorehdr and saved_max_pfn into
kernel/crash_dump.c Also export elfcorehdr_addr to make is_kdump_kernel()
usable for modules.

Leave 'elfcorehdr' as early_param().  This changes powerpc from __setup()
to early_param().  It adds an address range check from x86 also on ia64
and powerpc.

[akpm@linux-foundation.org: additional #includes]
[akpm@linux-foundation.org: remove elfcorehdr_addr export]
[akpm@linux-foundation.org: fix for Tejun's mm/nobootmem.c changes]
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-23 19:47:19 -07:00
Yinghai Lu
e782ab421b bootmem: Move contig_page_data definition to bootmem.c/nobootmem.c
Now that bootmem.c and nobootmem.c are separate, it's cleaner to
define contig_page_data in each file than in page_alloc.c with #ifdef.
Move it.

This patch doesn't introduce any behavior change.

-v2: According to Andrew, fixed the struct layout.
-tj: Updated commit description.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2011-02-24 14:43:06 +01:00
Yinghai Lu
0932587328 bootmem: Separate out CONFIG_NO_BOOTMEM code into nobootmem.c
mm/bootmem.c contained code paths for both bootmem and no bootmem
configurations.  They implement about the same set of APIs in
different ways and as a result bootmem.c contains massive amount of
#ifdef CONFIG_NO_BOOTMEM.

Separate out CONFIG_NO_BOOTMEM code into mm/nobootmem.c.  As the
common part is relatively small, duplicate them in nobootmem.c instead
of creating a common file or ifdef'ing in bootmem.c.

The followings are duplicated.

* {min|max}_low_pfn, max_pfn, saved_max_pfn
* free_bootmem_late()
* ___alloc_bootmem()
* __alloc_bootmem_low()

The followings are applicable only to nobootmem and moved verbatim.

* __free_pages_memory()
* free_all_memory_core_early()

The followings are not applicable to nobootmem and omitted in
nobootmem.c.

* reserve_bootmem_node()
* reserve_bootmem()

The rest split function bodies according to CONFIG_NO_BOOTMEM.

Makefile is updated so that only either bootmem.c or nobootmem.c is
built according to CONFIG_NO_BOOTMEM.

This patch doesn't introduce any behavior change.

-tj: Rewrote commit description.

Suggested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2011-02-24 14:43:05 +01:00
Yinghai Lu
a9ce6bc151 x86, memblock: Replace e820_/_early string with memblock_
1.include linux/memblock.h directly. so later could reduce e820.h reference.
2 this patch is done by sed scripts mainly

-v2: use MEMBLOCK_ERROR instead of -1ULL or -1UL

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-08-27 11:13:47 -07:00
Yinghai Lu
72d7c3b33c x86: Use memblock to replace early_res
1. replace find_e820_area with memblock_find_in_range
2. replace reserve_early with memblock_x86_reserve_range
3. replace free_early with memblock_x86_free_range.
4. NO_BOOTMEM will switch to use memblock too.
5. use _e820, _early wrap in the patch, in following patch, will
   replace them all
6. because memblock_x86_free_range support partial free, we can remove some special care
7. Need to make sure that memblock_find_in_range() is called after memblock_x86_fill()
   so adjust some calling later in setup.c::setup_arch()
   -- corruption_check and mptable_update

-v2: Move reserve_brk() early
    Before fill_memblock_area, to avoid overlap between brk and memblock_find_in_range()
    that could happen We have more then 128 RAM entry in E820 tables, and
    memblock_x86_fill() could use memblock_find_in_range() to find a new place for
    memblock.memory.region array.
    and We don't need to use extend_brk() after fill_memblock_area()
    So move reserve_brk() early before fill_memblock_area().
-v3: Move find_smp_config early
    To make sure memblock_find_in_range not find wrong place, if BIOS doesn't put mptable
    in right place.
-v4: Treat RESERVED_KERN as RAM in memblock.memory. and they are already in
    memblock.reserved already..
    use __NOT_KEEP_MEMBLOCK to make sure memblock related code could be freed later.
-v5: Generic version __memblock_find_in_range() is going from high to low, and for 32bit
    active_region for 32bit does include high pages
    need to replace the limit with memblock.default_alloc_limit, aka get_max_mapped()
-v6: Use current_limit instead
-v7: check with MEMBLOCK_ERROR instead of -1ULL or -1L
-v8: Set memblock_can_resize early to handle EFI with more RAM entries
-v9: update after kmemleak changes in mainline

Suggested-by: David S. Miller <davem@davemloft.net>
Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-08-27 11:12:29 -07:00
Yinghai Lu
f88eff74aa bootmem, x86: Add weak version of reserve_bootmem_generic
It will be used memblock_x86_to_bootmem converting

It is an wrapper for reserve_bootmem, and x86 64bit is using special one.

Also clean up that version for x86_64. We don't need to take care of numa
path for that, bootmem can handle it how

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-08-27 11:08:13 -07:00
Yinghai Lu
b8ab9f8202 x86,nobootmem: make alloc_bootmem_node fall back to other node when 32bit numa is used
Borislav Petkov reported his 32bit numa system has problem:

[    0.000000] Reserving total of 4c00 pages for numa KVA remap
[    0.000000] kva_start_pfn ~ 32800 max_low_pfn ~ 375fe
[    0.000000] max_pfn = 238000
[    0.000000] 8202MB HIGHMEM available.
[    0.000000] 885MB LOWMEM available.
[    0.000000]   mapped low ram: 0 - 375fe000
[    0.000000]   low ram: 0 - 375fe000
[    0.000000] alloc (nid=8 100000 - 7ee00000) (1000000 - ffffffff) 1000 1000 => 34e7000
[    0.000000] alloc (nid=8 100000 - 7ee00000) (1000000 - ffffffff) 200 40 => 34c9d80
[    0.000000] alloc (nid=0 100000 - 7ee00000) (1000000 - ffffffffffffffff) 180 40 => 34e6140
[    0.000000] alloc (nid=1 80000000 - c7e60000) (1000000 - ffffffffffffffff) 240 40 => 80000000
[    0.000000] BUG: unable to handle kernel paging request at 40000000
[    0.000000] IP: [<c2c8cff1>] __alloc_memory_core_early+0x147/0x1d6
[    0.000000] *pdpt = 0000000000000000 *pde = f000ff53f000ff00
...
[    0.000000] Call Trace:
[    0.000000]  [<c2c8b4f8>] ? __alloc_bootmem_node+0x216/0x22f
[    0.000000]  [<c2c90c9b>] ? sparse_early_usemaps_alloc_node+0x5a/0x10b
[    0.000000]  [<c2c9149e>] ? sparse_init+0x1dc/0x499
[    0.000000]  [<c2c79118>] ? paging_init+0x168/0x1df
[    0.000000]  [<c2c780ff>] ? native_pagetable_setup_start+0xef/0x1bb

looks like it allocates too much high address for bootmem.

Try to cut limit with get_max_mapped()

Reported-by: Borislav Petkov <borislav.petkov@amd.com>
Tested-by: Conny Seidel <conny.seidel@amd.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: <stable@kernel.org>		[2.6.34.x]
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-07-20 16:25:40 -07:00
Linus Torvalds
fb1ae63577 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip:
  x86: Fix double enable_IR_x2apic() call on SMP kernel on !SMP boards
  x86: Increase CONFIG_NODES_SHIFT max to 10
  ibft, x86: Change reserve_ibft_region() to find_ibft_region()
  x86, hpet: Fix bug in RTC emulation
  x86, hpet: Erratum workaround for read after write of HPET comparator
  bootmem, x86: Fix 32bit numa system without RAM on node 0
  nobootmem, x86: Fix 32bit numa system without RAM on node 0
  x86: Handle overlapping mptables
  x86: Make e820_remove_range to handle all covered case
  x86-32, resume: do a global tlb flush in S4 resume
2010-04-07 11:02:23 -07:00
Yinghai Lu
aa235fc712 bootmem, x86: Fix 32bit numa system without RAM on node 0
When 32bit numa is used, free_all_bootmem() will still only go over with
node id 0.

If node 0 doesn't have RAM installed, the lowest populated node
becomes low RAM.

This one fixes BOOTMEM path by iterating over the bdata_list.

-v3: add more comments, and fix bootmem path too.
-v4: seperate from one big patch

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4BB416D7.6090203@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-04-01 14:41:19 -07:00
Yinghai Lu
337998587f nobootmem, x86: Fix 32bit numa system without RAM on node 0
On one system without RAM on node0, got following boot dump with a 32
bit NUMA kernel:

early_node_map[4] active PFN ranges
    1: 0x00000010 -> 0x00000099
    1: 0x00000100 -> 0x0007da00
    1: 0x0007e800 -> 0x0007ffa0
    1: 0x0007ffae -> 0x0007ffb0
...
Subtract (29 early reservations)
  #000 [0000001000 - 0000002000]
  #001 [0000089000 - 000008f000]
  #002 [0000091000 - 0000093500]
...
  #027 [007cbfef40 - 007e800000]
  #028 [007e9ca000 - 007ff95000]
(0 free memory ranges)
Initializing HighMem for node 0 (00000000:00000000)
Initializing HighMem for node 1 (00000000:00000000)
Memory: 0k/2096832k available (6662k kernel code, 2096300k reserved, 4829k data, 484k init, 0k highmem)
...
Checking if this processor honours the WP bit even in supervisor mode...Ok.
swapper: page allocation failure. order:0, mode:0x0
Pid: 0, comm: swapper Not tainted 2.6.34-rc3-tip-03818-g4b1ea6c-dirty #35
Call Trace:
 [<4087a5dc>] ? printk+0xf/0x11
 [<40286728>] __alloc_pages_nodemask+0x417/0x487
 [<402a9ce1>] new_slab+0xe2/0x1fe
 [<402aa5b2>] kmem_cache_open+0x185/0x358
 [<402abbc0>] T.954+0x1c/0x60
 [<40d52a29>] kmem_cache_init+0x24/0x113
 [<40d39738>] start_kernel+0x166/0x2e4
 [<40d3940e>] ? unknown_bootoption+0x0/0x18e
 [<40d390ce>] i386_start_kernel+0xce/0xd5
Mem-Info:
Node 1 DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
Node 1 Normal per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:0 inactive_file:0 isolated_file:0
 unevictable:0 dirty:0 writeback:0 unstable:0
 free:0 slab_reclaimable:0 slab_unreclaimable:0
 mapped:0 shmem:0 pagetables:0 bounce:0

When 32bit NUMA is used, free_all_bootmem() will still only go over with
node id 0.

If node 0 doesn't have RAM installed, We need to go with node1
because early_node_map still use 1 for all ranges, and ram from node1
become low ram.

Use MAX_NUMNODES like 64-bit NUMA does.

Note: BOOTMEM path has the same problem.
      this bug exist before We have NO_BOOTMEM support.

-v3: add more comments, and fix bootmem path too.
-v4: seperate bootmem path fix

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4BB41689.9090502@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-04-01 14:39:29 -07:00
Tejun Heo
5a0e3ad6af include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-30 22:02:32 +09:00
Jiri Kosina
c26f91a3df x86: Remove excessive early_res debug output
Commit 08677214e3 ("x86: Make 64 bit use early_res instead
of bootmem  before slab") introduced early_res replacement for
bootmem, but left code  in __free_pages_memory() which dumps all
the ranges that are beeing freed,  without any additional
information, causing some noise in dmesg during  bootup.

Just remove printing of the ranges, that doesn't provide
anything useful  anyway.

While at it, remove other commented-out KERN_DEBUG messages in
the NO_BOOTMEM code as well.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Found-OK-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <alpine.LNX.2.00.1003220931360.18642@pobox.suse.cz>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-03-24 19:51:08 +01:00
Yinghai Lu
08677214e3 x86: Make 64 bit use early_res instead of bootmem before slab
Finally we can use early_res to replace bootmem for x86_64 now.

Still can use CONFIG_NO_BOOTMEM to enable it or not.

-v2: fix 32bit compiling about MAX_DMA32_PFN
-v3: folded bug fix from LKML message below

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4B747239.4070907@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-02-12 09:41:59 -08:00
Jan Beulich
8aa043d745 mm/bootmem.c: properly __init-annotate helper functions
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-12-15 08:53:20 -08:00
FUJITA Tomonori
9f993ac3f7 bootmem: Add free_bootmem_late()
Add a new function for freeing bootmem after the bootmem
allocator has been released and the unreserved pages given to
the page allocator.

This allows us to reserve bootmem and then release it if we
later discover it was not needed.

( This new API will be used by the swiotlb code to recover
  a significant amount of RAM (64MB). )

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: chrisw@sous-sol.org
Cc: dwmw2@infradead.org
Cc: joerg.roedel@amd.com
Cc: muli@il.ibm.com
Cc: hannes@cmpxchg.org
Cc: tj@kernel.org
Cc: akpm@linux-foundation.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <1257849980-22640-7-git-send-email-fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-11-10 12:31:43 +01:00
Catalin Marinas
008139d914 kmemleak: Do not report alloc_bootmem blocks as leaks
This patch sets the min_count for alloc_bootmem objects to 0 so that
they are never reported as leaks. This is because many of these blocks
are only referred via the physical address which is not looked up by
kmemleak.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
2009-08-27 14:29:17 +01:00
Catalin Marinas
ec3a354bd4 kmemleak: Add callbacks to the bootmem allocator
This patch adds kmemleak_alloc/free callbacks to the bootmem allocator.
This would allow scanning of such blocks and help avoiding a whole class
of false positives and more kmemleak annotations.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
2009-07-08 14:25:14 +01:00
Joe Perches
433f13a727 bootmem.c: avoid c90 declaration warning
[akpm@linux-foundation.org: cleanup]
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-06-19 16:46:04 -07:00
Pekka Enberg
c91c4773b3 bootmem: fix slab fallback on numa
If the user requested bootmem allocation on a specific node, we should use
kzalloc_node() for the fallback allocation.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-06-11 19:15:54 +03:00
Pekka Enberg
441c7e0a2e bootmem: use slab if bootmem is no longer available
As a preparation for initializing the slab allocator early, make sure the
bootmem allocator does not crash and burn if someone calls it after slab is up;
otherwise we'd need a flag day for switching to early slab.

Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-06-11 19:10:41 +03:00
Tejun Heo
d0c4f57027 bootmem, x86: further fixes for arch-specific bootmem wrapping
Impact: fix new breakages introduced by previous fix

Commit c132937556 tried to clean up
bootmem arch wrapper but it wasn't quite correct.  Before the commit,
the followings were broken.

* Low level interface functions prefixed with __ ignored arch
  preference.

* reserve_bootmem(...) can't be mapped into
  reserve_bootmem_node(NODE_DATA(0)->bdata, ...) because the node is
  not preference here.  The region specified MUST fall into the
  specified region; otherwise, it will panic.

After the commit,

* If allocation fails for the arch preferred node, it should fallback
  to whatever is available.  Instead, it simply failed allocation.

There are too many internal details to allow generic wrapping and
still keep things simple for archs.  Plus, all that arch wants is a
way to prefer certain node over another.

This patch drops the generic wrapping around alloc_bootmem_core() and
add alloc_bootmem_core() instead.  If necessary, arch can define
bootmem_arch_referred_node() macro or function which takes all
allocation information and returns the preferred node.  bootmem
generic code will always try the preferred node first and then
fallback to other nodes as usual.

Breakages noted and changes reviewed by Johannes Weiner.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
2009-03-01 16:06:56 +09:00