Commit Graph

234172 Commits

Author SHA1 Message Date
Len Brown cdaab4a0d3 x86 idle: deprecate "no-hlt" cmdline param
We'd rather that modern machines not check if HLT works on
every entry into idle, for the benefit of machines that had
marginal electricals 15-years ago.  If those machines are still running
the upstream kernel, they can use "idle=poll".  The only difference
will be that they'll now invoke HLT in machine_hlt().

cc: x86@kernel.org # .39.x
Signed-off-by: Len Brown <len.brown@intel.com>
2011-05-29 03:39:16 -04:00
Len Brown 99c6322143 x86 idle APM: deprecate CONFIG_APM_CPU_IDLE
We don't want to export the pm_idle function pointer to modules.
Currently CONFIG_APM_CPU_IDLE w/ CONFIG_APM_MODULE forces us to.

CONFIG_APM_CPU_IDLE is of dubious value, it runs only on 32-bit
uniprocessor laptops that are over 10 years old.  It calls into
the BIOS during idle, and is known to cause a number of machines
to fail.

Removing CONFIG_APM_CPU_IDLE and will allow us to stop exporting
pm_idle.  Any systems that were calling into the APM BIOS
at run-time will simply use HLT instead.

cc: x86@kernel.org
cc: Jiri Kosina <jkosina@suse.cz>
cc: stable@kernel.org # .39.x
Signed-off-by: Len Brown <len.brown@intel.com>
2011-05-29 03:39:15 -04:00
Len Brown 3b70b2e5fc x86 idle floppy: deprecate disable_hlt()
Plan to remove floppy_disable_hlt in 2012, an ancient
workaround with comments that it should be removed.

This allows us to remove clutter and a run-time branch
from the idle code.

WARN_ONCE() on invocation until it is removed.

cc: x86@kernel.org
cc: stable@kernel.org # .39.x
Signed-off-by: Len Brown <len.brown@intel.com>
2011-05-29 03:39:15 -04:00
Len Brown 06ae40ce07 x86 idle: EXPORT_SYMBOL(default_idle, pm_idle) only when APM demands it
In the long run, we don't want default_idle() or (pm_idle)() to
be exported outside of process.c.  Start by not exporting them
to modules, unless the APM build demands it.

cc: x86@kernel.org
cc: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-05-29 03:39:14 -04:00
Len Brown 02c68a0201 x86 idle: clarify AMD erratum 400 workaround
The workaround for AMD erratum 400 uses the term "c1e" falsely suggesting:
1. Intel C1E is somehow involved
2. All AMD processors with C1E are involved

Use the string "amd_c1e" instead of simply "c1e" to clarify that
this workaround is specific to AMD's version of C1E.
Use the string "e400" to clarify that the workaround is specific
to AMD processors with Erratum 400.

This patch is text-substitution only, with no functional change.

cc: x86@kernel.org
Acked-by: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-05-29 03:38:57 -04:00
Tim Chen 333c5ae994 idle governor: Avoid lock acquisition to read pm_qos before entering idle
Thanks to the reviews and comments by Rafael, James, Mark and Andi.
Here's version 2 of the patch incorporating your comments and also some
update to my previous patch comments.

I noticed that before entering idle state, the menu idle governor will
look up the current pm_qos target value according to the list of qos
requests received.  This look up currently needs the acquisition of a
lock to access the list of qos requests to find the qos target value,
slowing down the entrance into idle state due to contention by multiple
cpus to access this list.  The contention is severe when there are a lot
of cpus waking and going into idle.  For example, for a simple workload
that has 32 pair of processes ping ponging messages to each other, where
64 cpu cores are active in test system, I see the following profile with
37.82% of cpu cycles spent in contention of pm_qos_lock:

-     37.82%          swapper  [kernel.kallsyms]          [k]
_raw_spin_lock_irqsave
   - _raw_spin_lock_irqsave
      - 95.65% pm_qos_request
           menu_select
           cpuidle_idle_call
         - cpu_idle
              99.98% start_secondary

A better approach will be to cache the updated pm_qos target value so
reading it does not require lock acquisition as in the patch below.
With this patch the contention for pm_qos_lock is removed and I saw a
2.2X increase in throughput for my message passing workload.

cc: stable@kernel.org
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: James Bottomley <James.Bottomley@suse.de>
Acked-by: mark gross <markgross@thegnar.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-05-29 00:50:59 -04:00
Tero Kristo 7467571f44 cpuidle: menu: fixed wrapping timers at 4.294 seconds
Cpuidle menu governor is using u32 as a temporary datatype for storing
nanosecond values which wrap around at 4.294 seconds. This causes errors
in predicted sleep times resulting in higher than should be C state
selection and increased power consumption. This also breaks cpuidle
state residency statistics.

cc: stable@kernel.org # .32.x through .39.x
Signed-off-by: Tero Kristo <tero.kristo@nokia.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-05-29 00:35:47 -04:00
Linus Torvalds 521cb40b0c Linux 2.6.38 2011-03-14 18:20:32 -07:00
Linus Torvalds 59766edc79 Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-mn10300
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-mn10300:
  MN10300: atomic_read() should ensure it emits a load
  MN10300: The SMP_ICACHE_INV_FLUSH_RANGE IPI command does not exist
  MN10300: Proper use of macros get_user() in the case of incremented pointers
2011-03-14 15:20:39 -07:00
Linus Torvalds 2990821d0e Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus
* 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus: (26 commits)
  MIPS: Alchemy: Fix reset for MTX-1 and XXS1500
  MIPS: MTX-1: Make au1000_eth probe all PHY addresses
  MIPS: Jz4740: Add HAVE_CLK
  MIPS: Move idle task creation to work queue
  MIPS, Perf-events: Use unsigned delta for right shift in event update
  MIPS, Perf-events: Work with the new callchain interface
  MIPS, Perf-events: Fix event check in validate_event()
  MIPS, Perf-events: Work with the new PMU interface
  MIPS, Perf-events: Work with irq_work
  MIPS: Fix always CONFIG_LOONGSON_UART_BASE=y
  MIPS: Loongson: Fix potentially wrong string handling
  MIPS: Fix GCC-4.6 'set but not used' warning in arch/mips/mm/init.c
  MIPS: Fix GCC-4.6 'set but not used' warning in ieee754int.h
  MIPS: Remove unused code from arch/mips/kernel/syscall.c
  MIPS: Fix GCC-4.6 'set but not used' warning in signal*.c
  MIPS: MSP: Fix MSP71xx bpci interrupt handler return value
  MIPS: Select R4K timer lib for all MSP platforms
  MIPS: Loongson: Remove ad-hoc cmdline default
  MIPS: Clear the correct flag in sysmips(MIPS_FIXADE, ...).
  MIPS: Add an unreachable return statement to satisfy buggy GCCs.
  ...
2011-03-14 15:20:12 -07:00
Linus Torvalds 869c34f520 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86: ce4100: Set pci ops via callback instead of module init
  x86/mm: Fix pgd_lock deadlock
  x86/mm: Handle mm_fault_error() in kernel space
  x86: Don't check for BIOS corruption in first 64K when there's no need to
2011-03-14 15:19:09 -07:00
Linus Torvalds 52d3c03675 Revert "oom: oom_kill_process: fix the child_points logic"
This reverts the parent commit.  I hate doing that, but it's generating
some discussion ("half of it is right"), and since I am planning on
doing the 2.6.38 release later today we can punt it to stable if
required. Let's not rock the boat right now.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-14 15:17:07 -07:00
Oleg Nesterov dc1b83ab08 oom: oom_kill_process: fix the child_points logic
oom_kill_process() starts with victim_points == 0.  This means that
(most likely) any child has more points and can be killed erroneously.

Also, "children has a different mm" doesn't match the reality, we should
check child->mm != t->mm.  This check is not exactly correct if t->mm ==
NULL but this doesn't really matter, oom_kill_task() will kill them
anyway.

Note: "Kill all processes sharing p->mm" in oom_kill_task() is wrong
too.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-03-14 13:38:35 -07:00
Florian Fainelli 9ced975711 MIPS: Alchemy: Fix reset for MTX-1 and XXS1500
Since commit 32fd6901 (MIPS: Alchemy: get rid of common/reset.c)
Alchemy-based boards use their own reset function. For MTX-1 and XXS1500,
the reset function pokes at the BCSR.SYSTEM_RESET register, but this does
not work. According to Bruno Randolf, this was not tested when written.

Previously, the generic au1000_restart() routine called the board specific
reset function, which for MTX-1 and XXS1500 did not work, but finally made
a jump to the reset vector, which really triggers a system restart. Fix
reboot for both targets by jumping to the reset vector.

Signed-off-by: Florian Fainelli <florian@openwrt.org>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2093/
Acked-by: Bruno Randolf <br1@einfach.org>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2011-03-14 21:07:28 +01:00
Florian Fainelli bf3a1eb859 MIPS: MTX-1: Make au1000_eth probe all PHY addresses
When au1000_eth probes the MII bus for PHY address, if we do not set
au1000_eth platform data's phy_search_highest_address, the MII probing
logic will exit early and will assume a valid PHY is found at address 0.
For MTX-1, the PHY is at address 31, and without this patch, the link
detection/speed/duplex would not work correctly.

CC: stable@kernel.org
Signed-off-by: Florian Fainelli <florian@openwrt.org>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2111/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2011-03-14 21:07:27 +01:00
Maurus Cuelenaere ab5330eb26 MIPS: Jz4740: Add HAVE_CLK
Jz4740 supports the clock framework but doesn't have HAVE_CLK defined,
so define it!

Signed-off-by: Maurus Cuelenaere <mcuelenaere@gmail.com>
To: linux-mips@linux-mips.org
To: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/2112/
Acked-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2011-03-14 21:07:27 +01:00
Maksim Rayskiy 6667deb69e MIPS: Move idle task creation to work queue
To avoid forking usermode thread when creating an idle task, move fork_idle
to a work queue.

If kernel starts with maxcpus= option which does not bring all available
cpus online at boot time, idle tasks for offline cpus are not created. If
later offline cpus are hotplugged through sysfs, __cpu_up is called in
the context of the user task, and fork_idle copies its non-zero mm
pointer.  This causes BUG() in per_cpu_trap_init.

This also avoids issues with resource limits of the CPU writing to sysfs,
containers, maybe others.

Signed-off-by: Maksim Rayskiy <mrayskiy@broadcom.com>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2070/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2011-03-14 21:07:27 +01:00
Deng-Cheng Zhu ba9786f324 MIPS, Perf-events: Use unsigned delta for right shift in event update
Leverage the commit for ARM by Will Deacon:

- 446a5a8b1e
    ARM: 6205/1: perf: ensure counter delta is treated as unsigned

    Hardware performance counters on ARM are 32-bits wide but atomic64_t
    variables are used to represent counter data in the hw_perf_event structure.

    The armpmu_event_update function right-shifts a signed 64-bit delta variable
    and adds the result to the event count. This can lead to shifting in sign-bits
    if the MSB of the 32-bit counter value is set. This results in perf output
    such as:

     Performance counter stats for 'sleep 20':

     18446744073460670464  cycles             <-- 0xFFFFFFFFF12A6000
            7783773  instructions             #      0.000 IPC
                465  context-switches
                161  page-faults
            1172393  branches

       20.154242147  seconds time elapsed

    This patch ensures that the delta value is treated as unsigned so that the
    right shift sets the upper bits to zero.

Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
To: a.p.zijlstra@chello.nl
To: fweisbec@gmail.com
To: will.deacon@arm.com
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: wuzhangjin@gmail.com
Cc: paulus@samba.org
Cc: mingo@elte.hu
Cc: acme@redhat.com
Cc: matt@console-pimps.org
Cc: sshtylyov@mvista.com
Patchwork: http://patchwork.linux-mips.org/patch/2015/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2011-03-14 21:07:27 +01:00
Deng-Cheng Zhu 98f92f2f9e MIPS, Perf-events: Work with the new callchain interface
This is the MIPS part of the following commits by Frederic Weisbecker:

- f72c1a931e
    perf: Factorize callchain context handling

    Store the kernel and user contexts from the generic layer instead
    of archs, this gathers some repetitive code.

- 56962b4449
    perf: Generalize some arch callchain code

    - Most archs use one callchain buffer per cpu, except x86 that needs
      to deal with NMIs. Provide a default perf_callchain_buffer()
      implementation that x86 overrides.

    - Centralize all the kernel/user regs handling and invoke new arch
      handlers from there: perf_callchain_user() / perf_callchain_kernel()
      That avoid all the user_mode(), current->mm checks and so...

    - Invert some parameters in perf_callchain_*() helpers: entry to the
      left, regs to the right, following the traditional (dst, src).

- 70791ce9ba
    perf: Generalize callchain_store()

    callchain_store() is the same on every archs, inline it in
    perf_event.h and rename it to perf_callchain_store() to avoid
    any collision.

    This removes repetitive code.

- c1a65932fd
    perf: Drop unappropriate tests on arch callchains

    Drop the TASK_RUNNING test on user tasks for callchains as
    this check doesn't seem to make any sense.

    Also remove the tests for !current that is not supposed to
    happen and current->pid as this should be handled at the
    generic level, with exclude_idle attribute.

Reported-by: Wu Zhangjin <wuzhangjin@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
To: a.p.zijlstra@chello.nl
To: will.deacon@arm.com
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: paulus@samba.org
Cc: mingo@elte.hu
Cc: acme@redhat.com
Cc: dengcheng.zhu@gmail.com
Cc: matt@console-pimps.org
Cc: sshtylyov@mvista.com
Patchwork: http://patchwork.linux-mips.org/patch/2014/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2011-03-14 21:07:27 +01:00
Deng-Cheng Zhu c049b6a5f2 MIPS, Perf-events: Fix event check in validate_event()
Ignore events that are in off/error state or belong to a different PMU.

This patch originates from the following commit for ARM by Will Deacon:

- 65b4711ff5
    ARM: 6352/1: perf: fix event validation

    The validate_event function in the ARM perf events backend has the
    following problems:

    1.) Events that are disabled count towards the cost.
    2.) Events associated with other PMUs [for example, software events or
        breakpoints] do not count towards the cost, but do fail validation,
        causing the group to fail.

    This patch changes validate_event so that it ignores events in the
    PERF_EVENT_STATE_OFF state or that are scheduled for other PMUs.

Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
To: a.p.zijlstra@chello.nl
To: fweisbec@gmail.com
To: will.deacon@arm.com
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: wuzhangjin@gmail.com
Cc: paulus@samba.org
Cc: mingo@elte.hu
Cc: acme@redhat.com
Cc: dengcheng.zhu@gmail.com
Cc: matt@console-pimps.org
Cc: sshtylyov@mvista.com
Cc: ddaney@caviumnetworks.com
Patchwork: http://patchwork.linux-mips.org/patch/2013/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2011-03-14 21:07:27 +01:00
Deng-Cheng Zhu 404ff63840 MIPS, Perf-events: Work with the new PMU interface
This is the MIPS part of the following commits by Peter Zijlstra:

- a4eaf7f146
    perf: Rework the PMU methods

    Replace pmu::{enable,disable,start,stop,unthrottle} with
    pmu::{add,del,start,stop}, all of which take a flags argument.

    The new interface extends the capability to stop a counter while
    keeping it scheduled on the PMU. We replace the throttled state with
    the generic stopped state.

    This also allows us to efficiently stop/start counters over certain
    code paths (like IRQ handlers).

    It also allows scheduling a counter without it starting, allowing for
    a generic frozen state (useful for rotating stopped counters).

    The stopped state is implemented in two different ways, depending on
    how the architecture implemented the throttled state:

     1) We disable the counter:
        a) the pmu has per-counter enable bits, we flip that
        b) we program a NOP event, preserving the counter state

     2) We store the counter state and ignore all read/overflow events

For MIPSXX, the stopped state is implemented in the way of 1.b as above.

- 33696fc0d1
    perf: Per PMU disable

    Changes perf_disable() into perf_pmu_disable().

- 24cd7f54a0
    perf: Reduce perf_disable() usage

    Since the current perf_disable() usage is only an optimization,
    remove it for now. This eases the removal of the __weak
    hw_perf_enable() interface.

- b0a873ebbf
    perf: Register PMU implementations

    Simple registration interface for struct pmu, this provides the
    infrastructure for removing all the weak functions.

- 51b0fe3954
    perf: Deconstify struct pmu

    sed -ie 's/const struct pmu\>/struct pmu/g' `git grep -l "const struct pmu\>"`

Reported-by: Wu Zhangjin <wuzhangjin@gmail.com>
Acked-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
To: a.p.zijlstra@chello.nl
To: fweisbec@gmail.com
To: will.deacon@arm.com
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: wuzhangjin@gmail.com
Cc: paulus@samba.org
Cc: mingo@elte.hu
Cc: acme@redhat.com
Cc: dengcheng.zhu@gmail.com
Cc: matt@console-pimps.org
Cc: sshtylyov@mvista.com
Cc: ddaney@caviumnetworks.com
Patchwork: http://patchwork.linux-mips.org/patch/2012/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2011-03-14 21:07:26 +01:00
Deng-Cheng Zhu 91f017372a MIPS, Perf-events: Work with irq_work
This is the MIPS part of the following commit by Peter Zijlstra:

- e360adbe29
    irq_work: Add generic hardirq context callbacks

    Provide a mechanism that allows running code in IRQ context. It is
    most useful for NMI code that needs to interact with the rest of the
    system -- like wakeup a task to drain buffers.

    Perf currently has such a mechanism, so extract that and provide it as
    a generic feature, independent of perf so that others may also
    benefit.

    The IRQ context callback is generated through self-IPIs where
    possible, or on architectures like powerpc the decrementer (the
    built-in timer facility) is set to generate an interrupt immediately.

    Architectures that don't have anything like this get to do with a
    callback from the timer tick. These architectures can call
    irq_work_run() at the tail of any IRQ handlers that might enqueue such
    work (like the perf IRQ handler) to avoid undue latencies in
    processing the work.

For MIPSXX, we need to call irq_work_run() at the tail of the perf IRQ
handler as described above.

Reported-by: Wu Zhangjin <wuzhangjin@gmail.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
To: fweisbec@gmail.com
To: will.deacon@arm.com
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: paulus@samba.org
Cc: mingo@elte.hu
Cc: acme@redhat.com
Cc: matt@console-pimps.org
Cc: sshtylyov@mvista.com,
Patchwork: http://patchwork.linux-mips.org/patch/2011/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2011-03-14 21:07:26 +01:00
Yoichi Yuasa efe8dc556c MIPS: Fix always CONFIG_LOONGSON_UART_BASE=y
Signed-off-by: Yoichi Yuasa <yuasa@linux-mips.org>
Cc: linux-mips <linux-mips@linux-mips.org>
Patchwork: https://patchwork.linux-mips.org/patch/2055/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2011-03-14 21:07:26 +01:00
Stefan Weil 994fed2dd2 MIPS: Loongson: Fix potentially wrong string handling
This error was reported by cppcheck:
arch/mips/loongson/common/machtype.c:56: error: Dangerous usage of 'str' (strncpy doesn't always 0-terminate it)

If strncpy copied MACHTYPE_LEN bytes, the destination string str
was not terminated.

The patch adds one more byte to str and makes sure that this byte is
always 0.

Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Cc: Wu Zhangjin <wuzhangjin@gmail.com>
Cc: Arnaud Patard <apatard@mandriva.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/2053/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2011-03-14 21:07:26 +01:00
David Daney d3ce0e98b7 MIPS: Fix GCC-4.6 'set but not used' warning in arch/mips/mm/init.c
Under some combinations of CONFIG_*, lastpfn in page_is_ram is 'set
but not used'.  Mark it as __maybe_unused to quiet the warning/error.

Signed-off-by: David Daney <ddaney@caviumnetworks.com>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2033/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2011-03-14 21:07:26 +01:00