Commit Graph

3043 Commits

Author SHA1 Message Date
Eduardo Habkost
8d00450d29 Revert "x86: default to reboot via ACPI"
This reverts commit c7ffa6c262.

the assumptio of this change was that this would not break
any existing machine. Andrey Borzenkov reported troubles with
the ACPI reboot method: the system would hang on reboot, necessiating
a power cycle. Probably more systems are affected as well.

Also, there are patches queued up for v2.6.29 to disable virtualization
on emergency_restart() - which was the original motivation of
this change.

Reported-by: Andrey Borzenkov <arvidjaar@mail.ru>
Bisected-by: Andrey Borzenkov <arvidjaar@mail.ru>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-06 16:05:06 +01:00
Ingo Molnar
31f297143b Merge branch 'iommu-fixes-2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommu into x86/urgent 2008-11-06 15:23:35 +01:00
Joerg Roedel
80be308dfa AMD IOMMU: fix lazy IO/TLB flushing in unmap path
Lazy flushing needs to take care of the unmap path too which is not yet
implemented and leads to stale IO/TLB entries. This is fixed by this
patch.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2008-11-06 14:59:05 +01:00
Suresh Siddha
d6f0f39b7d x86: add smp_mb() before sending INVALIDATE_TLB_VECTOR
Impact: fix rare x2apic hang

On x86, x2apic mode accesses for sending IPI's don't have serializing
semantics. If the IPI receivner refers(in lock-free fashion) to some
memory setup by the sender, the need for smp_mb() before sending the
IPI becomes critical in x2apic mode.

Add the smp_mb() in native_flush_tlb_others() before sending the IPI.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-06 09:41:49 +01:00
Ben Hutchings
c78d0cf292 x86: don't allow nr_irqs > NR_IRQS
Impact: fix boot hang on 32-bit systems with more than 224 IO-APIC pins

On some 32-bit systems with a lot of IO-APICs probe_nr_irqs() can
return a value larger than NR_IRQS. This will lead to probe_irq_on()
overrunning the irq_desc array.

I hit this when running net-next-2.6 (close to 2.6.28-rc3) on a
Supermicro dual Xeon system.  NR_IRQS is 224 but probe_nr_irqs() detects
5 IOAPICs and returns 240.  Here are the log messages:

Tue Nov  4 16:53:47 2008 ACPI: IOAPIC (id[0x01] address[0xfec00000] gsi_base[0])
Tue Nov  4 16:53:47 2008 IOAPIC[0]: apic_id 1, version 32, address 0xfec00000, GSI 0-23
Tue Nov  4 16:53:47 2008 ACPI: IOAPIC (id[0x02] address[0xfec81000] gsi_base[24])
Tue Nov  4 16:53:47 2008 IOAPIC[1]: apic_id 2, version 32, address 0xfec81000, GSI 24-47
Tue Nov  4 16:53:47 2008 ACPI: IOAPIC (id[0x03] address[0xfec81400] gsi_base[48])
Tue Nov  4 16:53:47 2008 IOAPIC[2]: apic_id 3, version 32, address 0xfec81400, GSI 48-71
Tue Nov  4 16:53:47 2008 ACPI: IOAPIC (id[0x04] address[0xfec82000] gsi_base[72])
Tue Nov  4 16:53:47 2008 IOAPIC[3]: apic_id 4, version 32, address 0xfec82000, GSI 72-95
Tue Nov  4 16:53:47 2008 ACPI: IOAPIC (id[0x05] address[0xfec82400] gsi_base[96])
Tue Nov  4 16:53:47 2008 IOAPIC[4]: apic_id 5, version 32, address 0xfec82400, GSI 96-119
Tue Nov  4 16:53:47 2008 ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 high edge)
Tue Nov  4 16:53:47 2008 ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
Tue Nov  4 16:53:47 2008 Enabling APIC mode:  Flat.  Using 5 I/O APICs

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Acked-by: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-06 07:23:21 +01:00
Alok Kataria
70de9a9704 x86: don't use tsc_khz to calculate lpj if notsc is passed
Impact: fix udelay when "notsc" boot parameter is passed

With notsc passed on commandline, tsc may not be used for
udelays, make sure that we do not use tsc_khz to calculate
the lpj value in such cases.

Reported-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Signed-off-by: Alok N Kataria <akataria@vmware.com>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-04 09:55:26 +01:00
Linus Torvalds
67d1128425 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: fix AMDC1E and XTOPOLOGY conflict in cpufeature
  x86: build fix
2008-11-01 10:36:30 -07:00
Linus Torvalds
1f98757776 x86: Clean up late e820 resource allocation
This makes the late e820 resources use 'insert_resource_expand_to_fit()'
instead of doing a 'reserve_region_with_split()', and also avoids
marking them as IORESOURCE_BUSY.

This results in us being perfectly happy to use pre-existing PCI
resources even if they were marked as being in a reserved region, while
still avoiding any _new_ allocations in the reserved regions.  It also
makes for a simpler and more accurate resource tree.

Example resource allocation from Jonathan Corbet, who has firmware that
has an e820 reserved entry that covered a big range (e0000000-fed003ff),
and that had various PCI resources in it set up by firmware.

With old kernels, the reserved range would force us to re-allocate all
pre-existing PCI resources, and his reserved range would end up looking
like this:

	e0000000-fed003ff : reserved
	  fec00000-fec00fff : IOAPIC 0
	  fed00000-fed003ff : HPET 0

where only the pre-allocated special regions (IOAPIC and HPET) were kept
around.

With 2.6.28-rc2, which uses 'reserve_region_with_split()', Jonathan's
resource tree looked like this:

	e0000000-fe7fffff : reserved
	fe800000-fe8fffff : PCI Bus 0000:01
	 fe800000-fe8fffff : reserved
	fe900000-fe9d9aff : reserved
	fe9d9b00-fe9d9bff : 0000:00:1f.3
	 fe9d9b00-fe9d9bff : reserved
	fe9d9c00-fe9d9fff : 0000:00:1a.7
	 fe9d9c00-fe9d9fff : reserved
	fe9da000-fe9dafff : 0000:00:03.3
	 fe9da000-fe9dafff : reserved
	fe9db000-fe9dbfff : 0000:00:19.0
	 fe9db000-fe9dbfff : reserved
	fe9dc000-fe9dffff : 0000:00:1b.0
	 fe9dc000-fe9dffff : reserved
	fe9e0000-fe9fffff : 0000:00:19.0
	 fe9e0000-fe9fffff : reserved
	fea00000-fea7ffff : 0000:00:02.0
	 fea00000-fea7ffff : reserved
	fea80000-feafffff : 0000:00:02.1
	 fea80000-feafffff : reserved
	feb00000-febfffff : 0000:00:02.0
	 feb00000-febfffff : reserved
	fec00000-fed003ff : reserved
	 fec00000-fec00fff : IOAPIC 0
	 fed00000-fed003ff : HPET 0

and because the reserved entry had been split and moved into the
individual resources, and because it used the IORESOURCE_BUSY flag, the
drivers that actually wanted to _use_ those resources couldn't actually
attach to them:

	e1000e 0000:00:19.0: BAR 0: can't reserve mem region [0xfe9e0000-0xfe9fffff]
	HDA Intel 0000:00:1b.0: BAR 0: can't reserve mem region [0xfe9dc000-0xfe9dffff]

with this patch, the resource tree instead becomes

	e0000000-fed003ff : reserved
	  fe800000-fe8fffff : PCI Bus 0000:01
	  fe9d9b00-fe9d9bff : 0000:00:1f.3
	  fe9d9c00-fe9d9fff : 0000:00:1a.7
	    fe9d9c00-fe9d9fff : ehci_hcd
	  fe9da000-fe9dafff : 0000:00:03.3
	  fe9db000-fe9dbfff : 0000:00:19.0
	    fe9db000-fe9dbfff : e1000e
	  fe9dc000-fe9dffff : 0000:00:1b.0
	    fe9dc000-fe9dffff : ICH HD audio
	  fe9e0000-fe9fffff : 0000:00:19.0
	    fe9e0000-fe9fffff : e1000e
	  fea00000-fea7ffff : 0000:00:02.0
	  fea80000-feafffff : 0000:00:02.1
	  feb00000-febfffff : 0000:00:02.0
	  fec00000-fec00fff : IOAPIC 0
	  fed00000-fed003ff : HPET 0

ie the one reserved region now ends up surrounding all the PCI resources
that were allocated inside of it by firmware, and because it is not
marked BUSY, drivers have no problem attaching to the pre-allocated
resources.

Reported-and-tested-by: Jonathan Corbet <corbet@lwn.net>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Robert Hancock <hancockr@shaw.ca>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-01 10:17:22 -07:00
Ingo Molnar
b342797c1e x86: build fix
Impact: build fix on certain UP configs

fix:

 arch/x86/kernel/cpu/common.c: In function 'cpu_init':
 arch/x86/kernel/cpu/common.c:1141: error: 'boot_cpu_id' undeclared (first use in this function)
 arch/x86/kernel/cpu/common.c:1141: error: (Each undeclared identifier is reported only once
 arch/x86/kernel/cpu/common.c:1141: error: for each function it appears in.)

Pull in asm/smp.h on UP, so that we get the definition of
boot_cpu_id.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-31 09:31:38 +01:00
Linus Torvalds
74c75f524e Merge branch 'x86-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: cpu_index build fix
  x86/voyager: fix missing cpu_index initialisation
  x86/voyager: fix compile breakage caused by dc1e35c6e9
  x86: fix /dev/mem mmap breakage when PAT is disabled
  x86/voyager: fix compile breakage casued by x86: move prefill_possible_map calling early
  x86: use CONFIG_X86_SMP instead of CONFIG_SMP
  x86/voyager: fix boot breakage caused by x86: boot secondary cpus through initial_code
  x86, uv: fix compile error in uv_hub.h
  i386/PAE: fix pud_page()
  x86: remove debug code from arch_add_memory()
  x86: start annotating early ioremap pointers with __iomem
  x86: two trivial sparse annotations
  x86: fix init_memory_mapping for [dc000000 - e0000000) - v2
2008-10-30 18:33:46 -07:00
Ingo Molnar
1c4acdb467 x86: cpu_index build fix
fix:

 arch/x86/kernel/cpu/common.c: In function 'early_identify_cpu':
 arch/x86/kernel/cpu/common.c:553: error: 'struct cpuinfo_x86' has no member named 'cpu_index'

as cpu_index is only available on SMP.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-31 00:43:03 +01:00
James Bottomley
bfcb4c1bec x86/voyager: fix missing cpu_index initialisation
Impact: fix /proc/cpuinfo output on x86/Voyager

Ever since

| commit 92cb7612ae
| Author: Mike Travis <travis@sgi.com>
| Date:   Fri Oct 19 20:35:04 2007 +0200
|
|     x86: convert cpuinfo_x86 array to a per_cpu array

We've had an extra field in cpuinfo_x86 which is cpu_index.
Unfortunately, voyager has never initialised this, although the only
noticeable impact seems to be that /proc/cpuinfo shows all zeros for
the processor ids.

Anyway, fix this by initialising the boot CPU properly and setting the
index when the secondaries update.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-31 00:19:37 +01:00
James Bottomley
b3572e361b x86/voyager: fix compile breakage caused by dc1e35c6e9
Impact: build fix on x86/Voyager

Given commits like this:

| Author: Suresh Siddha <suresh.b.siddha@intel.com>
| Date:   Tue Jul 29 10:29:19 2008 -0700
|
|     x86, xsave: enable xsave/xrstor on cpus with xsave support

Which deliberately expose boot cpu dependence to pieces of the system,
I think it's time to explicitly have a variable for it to prevent this
continual misassumption that the boot CPU is zero.

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-31 00:19:33 +01:00
James Bottomley
017d9d20d8 x86: use CONFIG_X86_SMP instead of CONFIG_SMP
Impact: fix x86/Voyager boot

CONFIG_SMP is used for features which work on *all* x86 boxes.
CONFIG_X86_SMP is used for standard PC like x86 boxes (for things like
multi core and apics)

Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-30 22:53:10 +01:00
Linus Torvalds
8bd93ca7b0 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, gart: fix gart detection for Fam11h CPUs
  x86: 64 bit print out absent pages num too
  x86, kdump: fix invalid access on i386 sparsemem
  x86: fix APIC_DEBUG with inquire_remote_apic
  x86: AMD microcode patch loader author update
  x86: microcode patch loader author update
  mailmap: add Peter Oruba
  x86, bts: improve help text for BTS config
  doc/x86: fix doc subdirs
2008-10-30 12:50:59 -07:00
Linus Torvalds
d6c3112abe Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  x86/PCI: build failure at x86/kernel/pci-dma.c with !CONFIG_PCI
2008-10-30 12:09:44 -07:00
Joerg Roedel
ae9b940364 AMD IOMMU: fix detection of NP capable IOMMUs
This patch changes the code to use IOMMU_CAP_NPCACHE as a shift and not
as a mask.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2008-10-30 17:50:14 +01:00
Harvey Harrison
9352f5698d x86: two trivial sparse annotations
Impact: fewer sparse warnings, no functional changes

arch/x86/kernel/vsmp_64.c:87:14: warning: incorrect type in argument 1 (different address spaces)
arch/x86/kernel/vsmp_64.c:87:14:    expected void const volatile [noderef] <asn:2>*addr
arch/x86/kernel/vsmp_64.c:87:14:    got void *[assigned] address
arch/x86/kernel/vsmp_64.c:88:22: warning: incorrect type in argument 1 (different address spaces)
arch/x86/kernel/vsmp_64.c:88:22:    expected void const volatile [noderef] <asn:2>*addr
arch/x86/kernel/vsmp_64.c:88:22:    got void *
arch/x86/kernel/vsmp_64.c:100:23: warning: incorrect type in argument 2 (different address spaces)
arch/x86/kernel/vsmp_64.c:100:23:    expected void volatile [noderef] <asn:2>*addr
arch/x86/kernel/vsmp_64.c:100:23:    got void *
arch/x86/kernel/vsmp_64.c:101:23: warning: incorrect type in argument 1 (different address spaces)
arch/x86/kernel/vsmp_64.c:101:23:    expected void const volatile [noderef] <asn:2>*addr
arch/x86/kernel/vsmp_64.c:101:23:    got void *
arch/x86/mm/gup.c:235:6: warning: incorrect type in argument 1 (different base types)
arch/x86/mm/gup.c:235:6:    expected void const volatile [noderef] <asn:1>*<noident>
arch/x86/mm/gup.c:235:6:    got unsigned long [unsigned] [assigned] start

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-29 08:02:28 +01:00
Linus Torvalds
e946217e4f Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (31 commits)
  ftrace: fix current_tracer error return
  tracing: fix a build error on alpha
  ftrace: use a real variable for ftrace_nop in x86
  tracing/ftrace: make boot tracer select the sched_switch tracer
  tracepoint: check if the probe has been registered
  asm-generic: define DIE_OOPS in asm-generic
  trace: fix printk warning for u64
  ftrace: warning in kernel/trace/ftrace.c
  ftrace: fix build failure
  ftrace, powerpc, sparc64, x86: remove notrace from arch ftrace file
  ftrace: remove ftrace hash
  ftrace: remove mcount set
  ftrace: remove daemon
  ftrace: disable dynamic ftrace for all archs that use daemon
  ftrace: add ftrace warn on to disable ftrace
  ftrace: only have ftrace_kill atomic
  ftrace: use probe_kernel
  ftrace: comment arch ftrace code
  ftrace: return error on failed modified text.
  ftrace: dynamic ftrace process only text section
  ...
2008-10-28 09:52:25 -07:00
Linus Torvalds
0d8762c9ee Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  lockdep: fix irqs on/off ip tracing
  lockdep: minor fix for debug_show_all_locks()
  x86: restore the old swiotlb alloc_coherent behavior
  x86: use GFP_DMA for 24bit coherent_dma_mask
  swiotlb: remove panic for alloc_coherent failure
  xen: compilation fix of drivers/xen/events.c on IA64
  xen: portability clean up and some minor clean up for xencomm.c
  xen: don't reload cr3 on suspend
  kernel/resource: fix reserve_region_with_split() section mismatch
  printk: remove unused code from kernel/printk.c
2008-10-28 09:49:27 -07:00
Joerg Roedel
87c6f40128 x86, gart: fix gart detection for Fam11h CPUs
Impact: fix AMD Family 11h boot hangs / USB device problems

The AMD Fam11h CPUs have a K8 northbridge. This northbridge is different
from other family's because it lacks GART support (as I just learned).

But the kernel implicitly expects a GART if it finds an AMD northbridge.

Fix this by removing the Fam11h northbridge id from the scan list of K8
northbridges. This patch also changes the message in the GART driver
about missing K8 northbridges to tell that the GART is missing which is
the correct information in this case.

Reported-by: Jouni Malinen <jkmalinen@gmail.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-28 17:10:27 +01:00
Ken'ichi Ohmichi
e7706fc691 x86, kdump: fix invalid access on i386 sparsemem
Impact: fix kdump crash on 32-bit sparsemem kernels

Since linux-2.6.27, kdump has failed on i386 sparsemem kernel.
1st-kernel gets a panic just before switching to 2nd-kernel.

The cause is that a kernel accesses invalid mem_section by
page_to_pfn(image->swap_page) at machine_kexec().
image->swap_page is allocated if kexec for hibernation, but
it is not allocated if kdump. So if kdump, a kernel should
not access the mem_section corresponding to image->swap_page.

The attached patch fixes this invalid access.

Signed-off-by: Ken'ichi Ohmichi <oomichi@mxs.nes.nec.co.jp>
Cc: kexec-ml <kexec@lists.infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-28 16:48:24 +01:00
Peter Oruba
3c52204bb9 x86: AMD microcode patch loader author update
Removed author's email address from MODULE_AUTHOR.

Signed-off-by: Peter Oruba <peter.oruba@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-28 16:41:28 +01:00
Peter Oruba
36b75da27b x86: microcode patch loader author update
Removed one author's email address from module init message.

Signed-off-by: Peter Oruba <peter.oruba@amd.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-28 16:41:16 +01:00
Steven Rostedt
8115f3f0c9 ftrace: use a real variable for ftrace_nop in x86
Impact: avoid section mismatch warning, clean up

The dynamic ftrace determines which nop is safe to use at start up.
When it finds a safe nop for patching, it sets a pointer called ftrace_nop
to point to the code. All call sites are then patched to this nop.

Later, when tracing is turned on, this ftrace_nop variable is again used
to compare the location to make sure it is a nop before we update it to
an mcount call. If this fails just once, a warning is printed and ftrace
is disabled.

Rakib Mullick noted that the code that sets up the nop is a .init section
where as the nop itself is in the .text section. This is needed because
the nop is used later on after boot up. The problem is that the test of the
nop jumps back to the setup code and causes a "section mismatch" warning.

Rakib first recommended to convert the nop to .init.text, but as stated
above, this would fail since that text is used later.

The real solution is to extend Rabik's patch, and to make the ftrace_nop
into an array, and just save the code from the assembly to this array.

Now the section can stay as an init section, and we have a nop to use
later on.

Reported-by: Rakib Mullick <rakib.mullick@gmail.com>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-10-27 16:52:01 +01:00