Commit Graph

1860 Commits

Author SHA1 Message Date
Thomas Gleixner 303e967ff9 [PATCH] hrtimers; add state tracking
Reintroduce ktimers feature "optimized away" by the ktimers review process:
multiple hrtimer states to enable the running of hrtimers without holding the
cpu-base-lock.

(The "optimized" rbtree hack carried only 2 states worth of information and we
need 4 for high resolution timers and dynamic ticks.)

No functional changes.

Build-fixes-from: Andrew Morton <akpm@osdl.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:58 -08:00
Thomas Gleixner 3c8aa39d7c [PATCH] hrtimers: cleanup locking
Improve kernel/hrtimers.c locking: use a per-CPU base with a lock to control
locking of all clocks belonging to a CPU.  This simplifies code that needs to
lock all clocks at once.  This makes life easier for high-res timers and
dyntick.

No functional changes.

[ optimization change from Andrew Morton <akpm@osdl.org> ]

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:58 -08:00
Thomas Gleixner c9cb2e3d7c [PATCH] hrtimers: namespace and enum cleanup
- hrtimers did not use the hrtimer_restart enum and relied on the implict
  int representation. Fix the prototypes and the functions using the enums.
- Use seperate name spaces for the enumerations
- Convert hrtimer_restart macro to inline function
- Add comments

No functional changes.

[akpm@osdl.org: fix input driver]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:58 -08:00
Thomas Gleixner fd064b9b77 [PATCH] Extend next_timer_interrupt() to use a reference jiffie
For CONFIG_NO_HZ we need to calculate the next timer wheel event based on a
given jiffie value.  Extend the existing code to allow the extra 'now'
argument.  Provide a compability function for the existing implementations to
call the function with now == jiffies.  (This also solves the racyness of the
original code vs.  jiffies changing during the iteration.)

No functional changes to existing users of this infrastructure.

[ remove WARN_ON() that triggered on s390, by Carsten Otte <cotte@de.ibm.com> ]
[ made new helper static, Adrian Bunk <bunk@stusta.de> ]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:58 -08:00
Thomas Gleixner 1cfd68496e [PATCH] Fix cascade lookup of next_timer_interrupt
When searching for the next pending timer in the timer wheel we need to take
the cascade into account.  The current code has several problems:

 1. it looks into the previous cascade
 2. it ignores a pending cascade
 3. it ignores multiple cascades

Change the cascade lookup, so it calculates the array index from the point of
the next cascade and always look at the cascade buckets, when the cascade is
pending, i.e.  gets executed in the next timer softirq.  When multiple
cascades are pending, then lookup the next buckets too.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:58 -08:00
Ingo Molnar dde4b2b5f4 [PATCH] uninline irq_enter()
Uninline irq_enter().  [dynticks adds more stuff to it]

No functional changes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:58 -08:00
Thomas Gleixner 5d8b34fdcb [PATCH] clocksource: Add verification (watchdog) helper
The TSC needs to be verified against another clocksource.  Instead of using
hardwired assumptions of available hardware, provide a generic verification
mechanism.  The verification uses the best available clocksource and handles
the usability for high resolution timers / dynticks of the clocksource which
needs to be verified.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:57 -08:00
Thomas Gleixner 7e69f2b1ea [PATCH] clocksource: Remove the update callback
The clocksource code allows direct updates of the rating of a given
clocksource now.  Change TSC unstable tracking to use this interface and
remove the update callback.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:57 -08:00
Thomas Gleixner 73b08d2aa4 [PATCH] clocksource: replace is_continuous by a flag field
Using a flag filed allows to encode more than one information into a variable.
Preparatory patch for the generic clocksource verification.

[mingo@elte.hu: convert vmitime.c to the new clocksource flag]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:57 -08:00
Thomas Gleixner 92c7e00254 [PATCH] Simplify the registration of clocksources
Enqueue clocksources in rating order to make selection of the clocksource
easier.  Also check the match with an user override at enqueue time.

Preparatory patch for the generic clocksource verification.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:57 -08:00
John Stultz 411187fb05 [PATCH] GTOD: persistent clock support
Persistent clock support: do proper timekeeping across suspend/resume.

[bunk@stusta.de: cleanup]
Signed-off-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:57 -08:00
Ingo Molnar 41cf54455d [PATCH] Fix multiple conversion bugs in msecs_to_jiffies
Fix multiple conversion bugs in msecs_to_jiffies().

The main problem is that this condition:

	if (m > jiffies_to_msecs(MAX_JIFFY_OFFSET))

overflows if HZ is smaller than 1000!

This change is user-visible: for HZ=250 SUS-compliant poll()-timeout
value of -20 is mistakenly converted to 'immediate timeout'.

(The new dyntick code also triggered this, that's how we noticed.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:56 -08:00
Ingo Molnar 8b9365d753 [PATCH] Uninline jiffies.h functions
There are loads of fat functions hidden in jiffies.h.  Uninline them.  No code
changes.

[jeremy@goop.org: export fix]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:56 -08:00
john stultz f4304ab215 [PATCH] HZ free ntp
Distangle the NTP update from HZ.  This is necessary for dynamic tick enabled
kernels.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:56 -08:00
Thomas Gleixner 771ee3b04e [PATCH] Add a function to handle interrupt affinity setting
Provide funtions to:
 - check, whether an interrupt can set the affinity
 - pin the interrupt to a given cpu

Necessary for the ability to setup clocksources more flexible (e.g.  use the
different HPET channels per CPU)

[akpm@osdl.org: alpha build fix]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:56 -08:00
Thomas Gleixner 950f4427c2 [PATCH] Add irq flag to disable balancing for an interrupt
Add a flag so we can prevent the irq balancing of an interrupt.  Move the
bits, so we have room for more :)

Necessary for the ability to setup clocksources more flexible (e.g.  use the
different HPET channels per CPU)

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-16 08:13:56 -08:00
Linus Torvalds 414f827c46 Merge branch 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6
* 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6: (94 commits)
  [PATCH] x86-64: Remove mk_pte_phys()
  [PATCH] i386: Fix broken CONFIG_COMPAT_VDSO on i386
  [PATCH] i386: fix 32-bit ioctls on x64_32
  [PATCH] x86: Unify pcspeaker platform device code between i386/x86-64
  [PATCH] i386: Remove extern declaration from mm/discontig.c, put in header.
  [PATCH] i386: Rename cpu_gdt_descr and remove extern declaration from smpboot.c
  [PATCH] i386: Move mce_disabled to asm/mce.h
  [PATCH] i386: paravirt unhandled fallthrough
  [PATCH] x86_64: Wire up compat epoll_pwait
  [PATCH] x86: Don't require the vDSO for handling a.out signals
  [PATCH] i386: Fix Cyrix MediaGX detection
  [PATCH] i386: Fix warning in cpu initialization
  [PATCH] i386: Fix warning in microcode.c
  [PATCH] x86: Enable NMI watchdog for AMD Family 0x10 CPUs
  [PATCH] x86: Add new CPUID bits for AMD Family 10 CPUs in /proc/cpuinfo
  [PATCH] i386: Remove fastcall in paravirt.[ch]
  [PATCH] x86-64: Fix wrong gcc check in bitops.h
  [PATCH] x86-64: survive having no irq mapping for a vector
  [PATCH] i386: geode configuration fixes
  [PATCH] i386: add option to show more code in oops reports
  ...
2007-02-14 09:46:06 -08:00
Eric W. Biederman d912b0cc1a [PATCH] sysctl: add a parent entry to ctl_table and set the parent entry
Add a parent entry into the ctl_table so you can walk the list of parents and
find the entire path to a ctl_table entry.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-14 08:10:00 -08:00
Eric W. Biederman 77b14db502 [PATCH] sysctl: reimplement the sysctl proc support
With this change the sysctl inodes can be cached and nothing needs to be done
when removing a sysctl table.

For a cost of 2K code we will save about 4K of static tables (when we remove
de from ctl_table) and 70K in proc_dir_entries that we will not allocate, or
about half that on a 32bit arch.

The speed feels about the same, even though we can now cache the sysctl
dentries :(

We get the core advantage that we don't need to have a 1 to 1 mapping between
ctl table entries and proc files.  Making it possible to have /proc/sys vary
depending on the namespace you are in.  The currently merged namespaces don't
have an issue here but the network namespace under /proc/sys/net needs to have
different directories depending on which network adapters are visible.  By
simply being a cache different directories being visible depending on who you
are is trivial to implement.

[akpm@osdl.org: fix uninitialised var]
[akpm@osdl.org: fix ARM build]
[bunk@stusta.de: make things static]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-14 08:10:00 -08:00
Eric W. Biederman 1ff007eb8e [PATCH] sysctl: allow sysctl_perm to be called from outside of sysctl.c
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-14 08:09:59 -08:00
Eric W. Biederman 805b5d5e06 [PATCH] sysctl: factor out sysctl_head_next from do_sysctl
The current logic to walk through the list of sysctl table headers is slightly
painful and implement in a way it cannot be used by code outside sysctl.c

I am in the process of implementing a version of the sysctl proc support that
instead of using the proc generic non-caching monster, just uses the existing
sysctl data structure as backing store for building the dcache entries and for
doing directory reads.  To use the existing data structures however I need a
way to get at them.

[akpm@osdl.org: warning fix]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-14 08:09:59 -08:00
Eric W. Biederman 0b4d414714 [PATCH] sysctl: remove insert_at_head from register_sysctl
The semantic effect of insert_at_head is that it would allow new registered
sysctl entries to override existing sysctl entries of the same name.  Which is
pain for caching and the proc interface never implemented.

I have done an audit and discovered that none of the current users of
register_sysctl care as (excpet for directories) they do not register
duplicate sysctl entries.

So this patch simply removes the support for overriding existing entries in
the sys_sysctl interface since no one uses it or cares and it makes future
enhancments harder.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: David Howells <dhowells@redhat.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Corey Minyard <minyard@acm.org>
Cc: Neil Brown <neilb@suse.de>
Cc: "John W. Linville" <linville@tuxdriver.com>
Cc: James Bottomley <James.Bottomley@steeleye.com>
Cc: Jan Kara <jack@ucw.cz>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Mark Fasheh <mark.fasheh@oracle.com>
Cc: David Chinner <dgc@sgi.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-14 08:09:59 -08:00
Eric W. Biederman ae83681026 [PATCH] sysctl: remove support for directory strategy routines
parse_table has support for calling a strategy routine when descending into a
directory.  To date no one has used this functionality and the /proc/sys
interface has no analog to it.

So no one is using this functionality kill it and make the binary sysctl code
easier to follow.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-14 08:09:59 -08:00
Eric W. Biederman 6703ddfcce [PATCH] sysctl: remove support for CTL_ANY
There are currently no users in the kernel for CTL_ANY and it only has effect
on the binary interface which is practically unused.

So this complicates sysctl lookups for no good reason so just remove it.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-14 08:09:59 -08:00
Eric W. Biederman 2abc26fc6b [PATCH] sysctl: create sys/fs/binfmt_misc as an ordinary sysctl entry
binfmt_misc has a mount point in the middle of the sysctl and that mount point
is created as a proc_generic directory.

Doing it that way gets in the way of cleaning up the sysctl proc support as it
continues the existence of a horrible hack.  So instead simply create the
directory as an ordinary sysctl directory.  At least that removes the magic
special case.

[akpm@osdl.org: warning fix]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-02-14 08:09:59 -08:00