Commit Graph

174748 Commits

Author SHA1 Message Date
Roland Dreier 4beb3d6d14 x86: Don't use POSIX character classes in gen-insn-attr-x86.awk
Not all awk implementations (including the default awk in Ubuntu 9.10)
support POSIX character classes.  Since x86-opcode-map.txt is plain
ASCII, we can just use explicit ranges for lower case, alphabetic, and
alphanumeric characters instead.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
LKML-Reference: <adabphy750b.fsf@roland-alpha.cisco.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-12-17 07:03:21 -08:00
H. Peter Anvin c051346b7d Makefile: set LC_CTYPE, LC_COLLATE, LC_NUMERIC to C
There are a number of common Unix constructs like character ranges in
grep/sed/awk which don't work as expected with LC_COLLATE set to other
than C.  Similarly, set LC_CTYPE and LC_NUMERIC to C to avoid other
nasty surprises.

In order to make sure these actually take effect we also have to
clear LC_ALL.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Acked-by: Michal Marek <mmarek@sues.cz>
Acked-by: Masami Hiramatsu <mhiramat@redhat.com>
Acked-by: Roland Dreier <rdreier@cisco.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
LKML-Reference: <4B2A1761.4070904@suse.cz>
2009-12-17 07:03:21 -08:00
Yinghai Lu 6a1e008a09 x86: Increase MAX_EARLY_RES; insufficient on 32-bit NUMA
Due to recent changes wakeup and mptable, we run out of early
reservations on 32-bit NUMA.  Thus, adjust the available number.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4B22D754.2020706@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-12-16 16:46:23 -08:00
Yinghai Lu 3299625036 x86: Fix checking of SRAT when node 0 ram is not from 0
Found one system that boot from socket1 instead of socket0, SRAT get rejected...

[    0.000000] SRAT: Node 1 PXM 0 0-a0000
[    0.000000] SRAT: Node 1 PXM 0 100000-80000000
[    0.000000] SRAT: Node 1 PXM 0 100000000-2080000000
[    0.000000] SRAT: Node 0 PXM 1 2080000000-4080000000
[    0.000000] SRAT: Node 2 PXM 2 4080000000-6080000000
[    0.000000] SRAT: Node 3 PXM 3 6080000000-8080000000
[    0.000000] SRAT: Node 4 PXM 4 8080000000-a080000000
[    0.000000] SRAT: Node 5 PXM 5 a080000000-c080000000
[    0.000000] SRAT: Node 6 PXM 6 c080000000-e080000000
[    0.000000] SRAT: Node 7 PXM 7 e080000000-10080000000
...
[    0.000000] NUMA: Allocated memnodemap from 500000 - 701040
[    0.000000] NUMA: Using 20 for the hash shift.
[    0.000000] Adding active range (0, 0x2080000, 0x4080000) 0 entries of 3200 used
[    0.000000] Adding active range (1, 0x0, 0x96) 1 entries of 3200 used
[    0.000000] Adding active range (1, 0x100, 0x7f750) 2 entries of 3200 used
[    0.000000] Adding active range (1, 0x100000, 0x2080000) 3 entries of 3200 used
[    0.000000] Adding active range (2, 0x4080000, 0x6080000) 4 entries of 3200 used
[    0.000000] Adding active range (3, 0x6080000, 0x8080000) 5 entries of 3200 used
[    0.000000] Adding active range (4, 0x8080000, 0xa080000) 6 entries of 3200 used
[    0.000000] Adding active range (5, 0xa080000, 0xc080000) 7 entries of 3200 used
[    0.000000] Adding active range (6, 0xc080000, 0xe080000) 8 entries of 3200 used
[    0.000000] Adding active range (7, 0xe080000, 0x10080000) 9 entries of 3200 used
[    0.000000] SRAT: PXMs only cover 917504MB of your 1048566MB e820 RAM. Not used.
[    0.000000] SRAT: SRAT not used.

the early_node_map is not sorted because node0 with non zero start come first.

so try to sort it right away after all regions are registered.

also fixs refression by 8716273c (x86: Export srat physical topology)

-v2: make it more solid to handle cross node case like node0 [0,4g), [8,12g) and node1 [4g, 8g), [12g, 16g)
-v3: update comments.

Reported-and-tested-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4B2579D2.3010201@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-12-16 16:43:37 -08:00
Suresh Siddha 45a94d7cd4 x86, cpuid: Add "volatile" to asm in native_cpuid()
xsave_cntxt_init() does something like:

	cpuid(0xd, ..);	// find out what features FP/SSE/.. etc are supported

	xsetbv();	// enable the features known to OS

	cpuid(0xd, ..);	// find out the size of the context for features enabled

Depending on what features get enabled in xsetbv(), value of the
cpuid.eax=0xd.ecx=0.ebx changes correspondingly (representing the
size of the context that is enabled).

As we don't have volatile keyword for native_cpuid(), gcc 4.1.2
optimizes away the second cpuid and the kernel continues to use
the cpuid information obtained before xsetbv(), ultimately leading to kernel
crash on processors supporting more state than the legacy FP/SSE.

Add "volatile" for native_cpuid().

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <1261009542.2745.55.camel@sbs-t61.sc.intel.com>
Cc: stable@kernel.org
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-12-16 16:30:57 -08:00
Borislav Petkov 6ede31e030 x86, msr: msrs_alloc/free for CONFIG_SMP=n
Randy Dunlap reported the following build error:

"When CONFIG_SMP=n, CONFIG_X86_MSR=m:

ERROR: "msrs_free" [drivers/edac/amd64_edac_mod.ko] undefined!
ERROR: "msrs_alloc" [drivers/edac/amd64_edac_mod.ko] undefined!"

This is due to the fact that <arch/x86/lib/msr.c> is conditioned on
CONFIG_SMP and in the UP case we have only the stubs in the header.
Fork off SMP functionality into a new file (msr-smp.c) and build
msrs_{alloc,free} unconditionally.

Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Borislav Petkov <petkovbb@gmail.com>
LKML-Reference: <20091216231625.GD27228@liondog.tnic>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-12-16 15:36:32 -08:00
Andreas Herrmann 9d260ebc09 x86, amd: Get multi-node CPU info from NodeId MSR instead of PCI config space
Use NodeId MSR to get NodeId and number of nodes per processor.

Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
LKML-Reference: <20091216144355.GB28798@alberich.amd.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-12-16 15:06:23 -08:00
Sheng Yang 5df974009f x86: Add IA32_TSC_AUX MSR and use it
Clean up write_tsc() and write_tscp_aux() by replacing
hardcoded values.

No change in functionality.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
LKML-Reference: <1260942485-19156-4-git-send-email-sheng@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-16 09:02:42 +01:00
H. Peter Anvin 0b962d473a x86, msr/cpuid: Register enough minors for the MSR and CPUID drivers
register_chrdev() hardcodes registering 256 minors, presumably to
avoid breaking old drivers.  However, we need to register enough
minors so that we have all possible CPUs.

checkpatch warns on this patch, but the patch is correct: NR_CPUS here
is a static *upper bound* on the *maximum CPU index* (not *number of
CPUs!*) and that is what we want.

Reported-and-tested-by: Russ Anderson <rja@sgi.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
LKML-Reference: <tip-*@git.kernel.org>
2009-12-15 15:13:07 -08:00
Phillip Lougher 54291362d2 initramfs: add missing decompressor error check
The decompressors return error by calling a supplied error function, and/or
by returning an error return value.  The initramfs code, however, fails to
check the exit code returned by the decompressor, and only checks the error
status set by calling the error function.

This patch adds a return code check and calls the error function.

Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
LKML-Reference: <4b26b1ef.0+ZWxT6886olqcSc%phillip@lougher.demon.co.uk>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-12-15 14:04:24 -08:00
Phillip Lougher d4529862ca bzip2: Add missing checks for malloc returning NULL
Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
LKML-Reference: <4b26b1ef.ln20bM9Mn4gzB21L%phillip@lougher.demon.co.uk>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-12-15 14:04:19 -08:00
Phillip Lougher c1e7c3ae59 bzip2/lzma/gzip: pre-boot malloc doesn't return NULL on failure
The trivial malloc implementation used in the pre-boot environment by the
decompressors returns a bad pointer on failure (falling through after
calling error).  This is doubly wrong - the callers expect malloc to
return NULL on failure, second the error function is intended to be
used by the decompressors to propagate errors to *their* callers.  The
decompressors have no access to any state set by the error function.

Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
LKML-Reference: <4b26b1ef.hIInb2AYPMtImAJO%phillip@lougher.demon.co.uk>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-12-15 14:04:12 -08:00
Jonathan Nieder 23637568ad x86: Fix kprobes build with non-gawk awk
The instruction attribute table generator fails when run by mawk
or original-awk:

 $ mawk -f arch/x86/tools/gen-insn-attr-x86.awk \
	arch/x86/lib/x86-opcode-map.txt > /dev/null
 Semantic error at 240: Second IMM error
 $ echo $?
 1

Line 240 contains "c8: ENTER Iw,Ib", which indicates that this
instruction has two immediate operands, the second of which is
one byte.  The script loops through the immediate operands using
a for loop.

Unfortunately, there is no guarantee in awk that a for (variable
in array) loop will return the indices in increasing order.
Internally, both original-awk and mawk iterate over a hash table
for this purpose, and both implementations happen to produce the
index 2 before 1.  The supposed second immediate operand is more
than one byte wide, producing the error.

So loop over the indices in increasing order instead.  As a
side-effect, with mawk this means the silly two-entry hash table
never has to be built.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by Masami Hiramatsu <mhiramat@redhat.com>
Cc: Jim Keniston <jkenisto@us.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20091213220437.GA27718@progeny.tock>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-15 20:35:49 +01:00
Ingo Molnar bf08b3b1a1 Merge branch 'x86/mce' into x86/urgent
Merge reason: Leftover mini-topic from the merge window - merge it.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-15 20:33:53 +01:00
Ingo Molnar ab1eebe77d Merge branch 'x86/asm' into x86/urgent
Merge reason: it's stable so lets push it upstream.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-15 20:33:28 +01:00
FUJITA Tomonori 186a25026c x86: Split swiotlb initialization into two stages
The commit f4780ca005 moves
swiotlb initialization before dma32_free_bootmem(). It's
supposed to fix a bug that the commit
75f1cdf1dd introduced, we
initialize SWIOTLB right after dma32_free_bootmem so we wrongly
steal memory area allocated for GART with broken BIOS earlier.

However, the above commit introduced another problem, which
likely breaks machines with huge amount of memory. Such a box
use the majority of DMA32_ZONE so there is no memory for
swiotlb.

With this patch, the x86 IOMMU initialization sequence are:

1. We set swiotlb to 1 in the case of (max_pfn > MAX_DMA32_PFN
   && !no_iommu). If swiotlb usage is forced by the boot option,
   we go to the step 3 and finish (we don't try to detect IOMMUs).

2. We call the detection functions of all the IOMMUs. The
   detection function sets x86_init.iommu.iommu_init to the IOMMU
   initialization function (so we can avoid calling the
   initialization functions of all the IOMMUs needlessly).

3. We initialize swiotlb (and set dma_ops to swiotlb_dma_ops) if
   swiotlb is set to 1.

4. If the IOMMU initialization function doesn't need swiotlb
   (e.g. the initialization is sucessful) then sets swiotlb to zero.

5. If we find that swiotlb is set to zero, we free swiotlb
   resource.

Reported-by: Yinghai Lu <yinghai@kernel.org>
Reported-by: Roland Dreier <rdreier@cisco.com>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
LKML-Reference: <20091215204729A.fujita.tomonori@lab.ntt.co.jp>
Tested-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-15 13:01:57 +01:00
H. Peter Anvin 873b5271f8 x86: Regex support and known-movable symbols for relocs, fix _end
This adds a new category of symbols to the relocs program: symbols
which are known to be relative, even though the linker emits them as
absolute; this is the case for symbols that live in the linker script,
which currently applies to _end.

Unfortunately the previous workaround of putting _end in its own empty
section was defeated by newer binutils, which remove empty sections
completely.

This patch also changes the symbol matching to use regular expressions
instead of hardcoded C for specific patterns.

This is a decidedly non-minimal patch: a modified version of the
relocs program is used as part of the Syslinux build, and this 	is
basically a backport to Linux of some of those changes; they have
thus been well tested.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
LKML-Reference: <4AF86211.3070103@zytor.com>
Acked-by: Michal Marek <mmarek@suse.cz>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
2009-12-14 13:55:20 -08:00
H. Peter Anvin 494c2ebfb2 x86, msr: Remove incorrect, duplicated code in the MSR driver
The MSR driver would compute the values for cpu and c at declaration,
and then again in the body of the function.  This isn't merely
redundant, but unsafe, since cpu might not refer to a valid CPU at
that point.

Remove the unnecessary and dangerous references in the declarations.
This code now matches the equivalent code in the CPUID driver.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2009-12-14 10:05:05 -08:00
Hidetoshi Seto 70fe440718 x86, mce: Clean up thermal init by introducing intel_thermal_supported()
It looks better to have a common function. No change in functionality.

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
LKML-Reference: <4B25FDDC.407@jp.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
2009-12-14 10:38:41 +01:00
Cyrill Gorcunov 485a2e1973 x86, mce: Thermal monitoring depends on APIC being enabled
Add check if APIC is not disabled since thermal
monitoring depends on it. As only apic gets disabled
we should not try to install "thermal monitor" vector,
print out that thermal monitoring is enabled and etc...

Note that "Intel Correct Machine Check Interrupts" already
has such a check.

Also I decided to not add cpu_has_apic check into
mcheck_intel_therm_init since even if it'll call apic_read on
disabled apic -- it's safe here and allow us to save a few code
bytes.

Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
LKML-Reference: <4B25FDC2.3020401@jp.fujitsu.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-14 10:38:41 +01:00
Yinghai Lu f3eee54276 x86: Gart: fix breakage due to IOMMU initialization cleanup
This fixes the following breakage of the commit
75f1cdf1dd:

- GART systems that don't AGP with broken BIOS and more than 4GB
  memory are forced to use swiotlb. They can allocate aperture by
  hand and use GART.

- GART systems without GAP must disable GART on shutdown.

- swiotlb usage is forced by the boot option,
  gart_iommu_hole_init() is not called, so we disable GART
  early_gart_iommu_check().

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
LKML-Reference: <1260759135-6450-3-git-send-email-fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-14 08:57:40 +01:00
FUJITA Tomonori f4780ca005 x86: Move swiotlb initialization before dma32_free_bootmem
The commit 75f1cdf1dd introduced a
bug that we initialize SWIOTLB right after dma32_free_bootmem so
we wrongly steal memory area allocated for GART with broken BIOS
earlier.

This moves swiotlb initialization before dma32_free_bootmem().

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: yinghai@kernel.org
LKML-Reference: <1260759135-6450-2-git-send-email-fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-14 08:57:40 +01:00
Joe Perches eba11d6da7 x86: Fix build warning in arch/x86/mm/mmio-mod.c
Stephen Rothwell reported these warnings:

 arch/x86/mm/mmio-mod.c: In function 'print_pte':
 arch/x86/mm/mmio-mod.c💯 warning: too many arguments for format
 arch/x86/mm/mmio-mod.c:106: warning: too many arguments for format

The 'fmt' was left out accidentally.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Joe Perches <joe@perches.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus <torvalds@linux-foundation.org>
LKML-Reference: <1260775443.18538.16.camel@Joe-Laptop.home>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-14 08:55:43 +01:00
FUJITA Tomonori 06f8bda832 x86: Remove usedac in feature-removal-schedule.txt
The reason of removal, "replaced by allowdac and no dac
combination" is incorrect. There is no way to do the same thing
with "allowdac" and "nodac" combination.

The usedac option enables us to stop via_no_dac() setting
forbid_dac to 1. That is, someone who uses VIA bridges can use
DAC with this option even if some of VIA bridges seem to be
broken about DAC.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: WANG Cong <amwang@redhat.com>
Cc: gcosta@redhat.com
LKML-Reference: <20091214104423X.fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-14 08:53:54 +01:00
Cliff Wickman 1d865fb728 x86: Fix duplicated UV BAU interrupt vector
Interrupt vector 0xec has been doubly defined in irq_vectors.h

It seems arbitrary whether LOCAL_PENDING_VECTOR or
UV_BAU_MESSAGE is the higher number.  As long as they are
unique. If they are not unique we'll hit a BUG in
alloc_system_vector().

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Cc: <stable@kernel.org>
LKML-Reference: <E1NJ9Pe-0004P7-0Q@eag09.americas.sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-12-13 08:17:40 +01:00