Commit Graph

1389 Commits

Author SHA1 Message Date
Roland McGrath
10faa81e10 x86: debugctlmsr arch_has_block_step
This implements user-mode step-until-branch on x86 using the BTF bit
in MSR_IA32_DEBUGCTLMSR.  It's just like single-step, only less so.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:54 +01:00
Roland McGrath
7e9916040b x86: debugctlmsr context switch
This adds low-level support for a per-thread value of MSR_IA32_DEBUGCTLMSR.
The per-thread value is switched in when TIF_DEBUGCTLMSR is set.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:54 +01:00
Roland McGrath
d9771e8c50 x86: x86-32 ptrace debugreg cleanup
This cleans up the 32-bit ptrace code to separate the guts of the
debug register access from the implementation of PTRACE_PEEKUSR and
PTRACE_POKEUSR.  The new functions ptrace_[gs]et_debugreg match the
new 64-bit entry points for parity, but they don't need to be global.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:52 +01:00
Roland McGrath
962ff3804d x86: x86-64 ptrace debugreg cleanup
This cleans up the 64-bit ptrace code to separate the guts of the
debug register access from the implementation of PTRACE_PEEKUSR and
PTRACE_POKEUSR.  The new functions ptrace_[gs]et_debugreg are made
global so that the ia32 code can later be changed to call them too.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:52 +01:00
Roland McGrath
e4aed6cc45 x86-64 ptrace: use task_pt_regs
This cleans up the 64-bit ptrace code to use task_pt_regs instead of its
own redundant code that does the same thing a different way.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:52 +01:00
Roland McGrath
62a97d447b x86-32 ptrace: use task_pt_regs
This cleans up the 32-bit ptrace code to use task_pt_regs instead of its
own redundant code that does the same thing a different way.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:52 +01:00
Roland McGrath
227195d4a6 x86-32: ptrace generic resume
This removes the handling for PTRACE_CONT et al from the 32-bit
ptrace code, so it uses the new generic code via ptrace_request.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:51 +01:00
Roland McGrath
18982c158f x86-64: ptrace generic resume
This removes the handling for PTRACE_CONT et al from the 64-bit
ptrace code, so it uses the new generic code via ptrace_request.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:51 +01:00
Roland McGrath
e1f287735c x86 single_step: TIF_FORCED_TF
This changes the single-step support to use a new thread_info flag
TIF_FORCED_TF instead of the PT_DTRACE flag in task_struct.ptrace.
This keeps arch implementation uses out of this non-arch field.

This changes the ptrace access to eflags to mask TF and maintain
the TIF_FORCED_TF flag directly if userland sets TF, instead of
relying on ptrace_signal_deliver.  The 64-bit and 32-bit kernels
are harmonized on this same behavior.  The ptrace_signal_deliver
approach works now, but this change makes the low-level register
access code reliable when called from different contexts than a
ptrace stop, which will be possible in the future.

The 64-bit do_debug exception handler is also changed not to clear TF
from user-mode registers.  This matches the 32-bit kernel's behavior.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:50 +01:00
Roland McGrath
7122ec8158 x86: single_step: share code
This removes the single-step code from ptrace_32.c and uses the step.c code
shared with the 64-bit kernel.  The two versions of the code were nearly
identical already, so the shared code has only a couple of simple #ifdef's.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:50 +01:00
Roland McGrath
5f76cb1f6c x86: single_step 0xf0
This fixes the 64-bit single-step handling code's instruction
decoder to grok the 0xf0 (lock) prefix, which the 32-bit code
already does correctly.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:50 +01:00
Roland McGrath
3f80c1adc9 x86: single_step segment macros
This cleans up the single-step code to use the asm/segment.h macros
for segment selector magic bits, rather than its own constant.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:50 +01:00
Roland McGrath
fa1e03eae2 x86: single_step moved
This moves the single-step support code from ptrace_64.c into a new file
step.c, verbatim.  This paves the way for consolidating this code between
64-bit and 32-bit versions.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:50 +01:00
Roland McGrath
7f232343e0 x86: arch_has_single_step
This defines the new standard arch_has_single_step macro.  It makes the
existing set_singlestep and clear_singlestep entry points global, and
renames them to the new standard names user_enable_single_step and
user_disable_single_step, respectively.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:48 +01:00
Roland McGrath
77c03dcd44 x86: remove TRAP_FLAG
This gets rid of the local constant macro TRAP_FLAG.
It's redundant with the public constant macro X86_EFLAGS_TF.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:48 +01:00
Mathieu Desnoyers
2c0b8a7578 x86: fall back on interrupt disable in cmpxchg8b on 80386 and 80486
Actually, on 386, cmpxchg and cmpxchg_local fall back on
cmpxchg_386_u8/16/32: it disables interruptions around non atomic
updates to mimic the cmpxchg behavior.

The comment:
/* Poor man's cmpxchg for 386. Unsuitable for SMP */

already present in cmpxchg_386_u32 tells much about how this cmpxchg
implementation should not be used in a SMP context. However, the cmpxchg_local
can perfectly use this fallback, since it only needs to be atomic wrt the local
cpu.

This patch adds a cmpxchg_486_u64 and uses it as a fallback for cmpxchg64
and cmpxchg64_local on 80386 and 80486.

Q:
but why is it called cmpxchg_486 when the other functions are called

A:
Because the standard cmpxchg is missing only on 386, but cmpxchg8b is
missing both on 386 and 486.

Citing Intel's Instruction set reference:

cmpxchg:
This instruction is not supported on Intel processors earlier than the
Intel486 processors.

cmpxchg8b:
This instruction encoding is not supported on Intel processors earlier
than the Pentium processors.

Q:
What's the reason to have cmpxchg64_local on 32 bit architectures?
Without that need all this would just be a few simple defines.

A:
cmpxchg64_local on 32 bits architectures takes unsigned long long
parameters, but cmpxchg_local only takes longs. Since we have cmpxchg8b
to execute a 8 byte cmpxchg atomically on pentium and +, it makes sense
to provide a flavor of cmpxchg and cmpxchg_local using this instruction.

Also, for 32 bits architectures lacking the 64 bits atomic cmpxchg, it
makes sense _not_ to define cmpxchg64 while cmpxchg could still be
available.

Moreover, the fallback for cmpxchg8b on i386 for 386 and 486 is a

However, cmpxchg64_local will be emulated by disabling interrupts on all
architectures where it is not supported atomically.

Therefore, we *could* turn cmpxchg64_local into a cmpxchg_local, but it
would make the 386/486 fallbacks ugly, make its design different from
cmpxchg/cmpxchg64 (which really depends on atomic operations and cannot
be emulated) and require the __cmpxchg_local to be expressed as a macro
rather than an inline function so the parameters would not be fixed to
unsigned long long in every case.

So I think cmpxchg64_local makes sense there, but I am open to
suggestions.

Q:
Are there any callers?

A:
I am actually using it in LTTng in my timestamping code. I use it to
work around CPUs with asynchronous TSCs. I need to update 64 bits
values atomically on this 32 bits architecture.

Changelog:
- Ran though checkpatch.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:47 +01:00
Ralf Baechle
5f627f8e12 mips, x86: optimize the i8259 code a bit
The timer code always calls the clock_event_device set_net_event and
set_mode methods with interrupts disabled, so no need to use
spin_lock_irqsave / spin_unlock_irqrestore for those.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by:Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:47 +01:00
Christoph Lameter
b263295dbf x86: 64-bit, make sparsemem vmemmap the only memory model
Use sparsemem as the only memory model for UP, SMP and NUMA.  Measurements
indicate that DISCONTIGMEM has a higher overhead than sparsemem.  And
FLATMEMs benefits are minimal.  So I think its best to simply standardize
on sparsemem.

Results of page allocator tests (test can be had via git from slab git
tree branch tests)

Measurements in cycle counts. 1000 allocations were performed and then the
average cycle count was calculated.

Order	FlatMem	Discontig	SparseMem
0	  639	  665		  641
1	  567	  647		  593
2	  679	  774		  692
3	  763	  967		  781
4	  961	 1501		  962
5	 1356	 2344		 1392
6	 2224	 3982		 2336
7	 4869	 7225		 5074
8	12500	14048		12732
9	27926	28223		28165
10	58578	58714		58682

(Note that FlatMem is an SMP config and the rest NUMA configurations)

Memory use:

SMP Sparsemem
-------------

Kernel size:

   text    data     bss     dec     hex filename
3849268  397739 1264856 5511863  541ab7 vmlinux

             total       used       free     shared    buffers     cached
Mem:       8242252      41164    8201088          0        352      11512
-/+ buffers/cache:      29300    8212952
Swap:      9775512          0    9775512

SMP Flatmem
-----------

Kernel size:

   text    data     bss     dec     hex filename
3844612  397739 1264536 5506887  540747 vmlinux

So 4.5k growth in text size vs. FLATMEM.

             total       used       free     shared    buffers     cached
Mem:       8244052      40544    8203508          0        352      11484
-/+ buffers/cache:      28708    8215344

2k growth in overall memory use after boot.

NUMA discontig:

   text    data     bss     dec     hex filename
3888124  470659 1276504 5635287  55fcd7 vmlinux

             total       used       free     shared    buffers     cached
Mem:       8256256      56908    8199348          0        352      11496
-/+ buffers/cache:      45060    8211196
Swap:      9775512          0    9775512

NUMA sparse:

   text    data     bss     dec     hex filename
3896428  470659 1276824 5643911  561e87 vmlinux

8k text growth. Given that we fully inline virt_to_page and friends now
that is rather good.

             total       used       free     shared    buffers     cached
Mem:       8264720      57240    8207480          0        352      11516
-/+ buffers/cache:      45372    8219348
Swap:      9775512          0    9775512

The total available memory is increased by 8k.

This patch makes sparsemem the default and removes discontig and
flatmem support from x86.

[ akpm@linux-foundation.org: allnoconfig build fix ]

Acked-by: Andi Kleen <ak@suse.de>
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:47 +01:00
Borislav Petkov
4ebd1290ba x86: vmlinux_32.lds.S: remove repeated comment from the x86-32 linker script
Remove repeated comment from the linker script for the x86-32 target.

Signed-off-by: Borislav Petkov <bbpetkov@yahoo.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:46 +01:00
Yinghai Lu
9de819fe72 x86: do not set boot cpu in cpu_online_map at x86_64_start_kernel()
In init/main.c boot_cpu_init() does that later.

Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:46 +01:00
Yinghai Lu
949ec325c7 x86: set cpu_index to nr_cpus instead of 0
Some BIOSes that support two/four dualcore/quadcore systems, will get:

ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 15:1 APIC version 16
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Processor #1 15:1 APIC version 16
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x02] enabled)
Processor #2 15:1 APIC version 16
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x03] enabled)
Processor #3 15:1 APIC version 16
ACPI: LAPIC (acpi_id[0x05] lapic_id[0x84] disabled)
ACPI: LAPIC (acpi_id[0x06] lapic_id[0x85] disabled)
ACPI: LAPIC (acpi_id[0x07] lapic_id[0x86] disabled)
ACPI: LAPIC (acpi_id[0x08] lapic_id[0x87] disabled)
ACPI: LAPIC (acpi_id[0x09] lapic_id[0x88] disabled)
ACPI: LAPIC (acpi_id[0x0a] lapic_id[0x89] disabled)
ACPI: LAPIC (acpi_id[0x0b] lapic_id[0x8a] disabled)
ACPI: LAPIC (acpi_id[0x0c] lapic_id[0x8b] disabled)
ACPI: LAPIC (acpi_id[0x0d] lapic_id[0x8c] disabled)
ACPI: LAPIC (acpi_id[0x0e] lapic_id[0x8d] disabled)
ACPI: LAPIC (acpi_id[0x0f] lapic_id[0x8e] disabled)
ACPI: LAPIC (acpi_id[0x10] lapic_id[0x8f] disabled)

SMP: Allowing 16 CPUs, 12 hotplug CPUs

the /proc/cpuinfo will show a bunch of NULL cpus with cpu_index=0

so assign impossible cpu_index value at first instead of 0.

Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:46 +01:00
Vladimir Berezniker
b3ca74a2bf x86: sanitize user specified e820 memmap values
Sanitize user specified e820 memory ranges, using the same logic that is
applied to the values returned by the BIOS.  This ensures consistent
handling regardless of the source of the memory mappings.

Allows overriding portions of the memory map without specifying one in
it's entirety (memmap=exactmap).

E.g. marking a range of bad RAM as reserved with memmap=48M$528M

BIOS supplied range

BIOS-e820: 0000000000100000 - 000000007fe80000 (usable)

becomes

user: 0000000000100000 - 0000000021000000 (usable)
user: 0000000021000000 - 0000000024000000 (reserved)
user: 0000000024000000 - 000000007fe80000 (usable)

Previously this did not work, as the original BIOS range was left
untouched while the user defined range was appended to the end of the
memory map.

[ tglx: arch/x86 adaptation ]

Signed-off-by: Vladimir Berezniker <vmpn@hitechman.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:46 +01:00
Roland McGrath
efd1ca52d0 x86: TLS cleanup
This consolidates the four different places that implemented the same
encoding magic for the GDT-slot 32-bit TLS support.  The old tls32.c was
renamed and is now only slightly modified to be the shared implementation.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Zachary Amsden <zach@vmware.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:46 +01:00
Roland McGrath
13abd0e504 x86: tls32 moved
This renames arch/x86/ia32/tls32.c to arch/x86/kernel/tls.c, which does
nothing now but paves the way to consolidate this code for 32-bit too.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Zachary Amsden <zach@vmware.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:45 +01:00
Roland McGrath
df5d438e33 x86: ptrace fs/gs_base
The fs_base and gs_base fields are available in user_regs_struct.
But reading these via ptrace (PTRACE_GETREGS or PTRACE_PEEKUSR) does
not give a reliably useful value.  The thread_struct fields are 0
when do_arch_prctl decided to use a GDT slot instead of MSR_FS_BASE,
which it does for a value under 1<<32.

This changes ptrace access to fs_base and gs_base to work like
PTRACE_ARCH_PRCTL does.  That is, it reads the base address that
user-mode memory access using the fs/gs instruction prefixes will
use, regardless of how it's being implemented in the kernel.  The
MSR vs GDT is an implementation detail that is pretty much hidden
from userland in the actual using, and there is no reason that
ptrace should give the internal implementation picture rather than
the user-mode semantic picture.  In the case of setting the value,
this can implicitly change the fsindex/gsindex value (also
separately in user_regs_struct), which is what happens when the
thread calls arch_prctl itself.  In a PTRACE_SETREGS, the fs_base
change will come after the fsindex change due to the order of the
struct, and so a change the debugger made to fs_base will have the
effect intended, another part of the user_regs_struct will now
differ when read back from what the debugger wrote.

This makes PTRACE_ARCH_PRCTL obsolete.  We could consider declaring
it deprecated and removing it one day, though there is no hurry.
For the foreseeable future, debuggers have to assume an old kernel
that does not report reliable fs_base/gs_base values in user_regs_struct
and stick to PTRACE_ARCH_PRCTL anyway.

Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-01-30 13:30:45 +01:00