Commit Graph

215 Commits

Author SHA1 Message Date
Sebastian Andrzej Siewior
1c3cc11602 timers: Consider slack value in mod_timer()
There is an optimization which does not update the timer if the timer
was pending and the expiration time was unchanged.

Since commit 3bbb9ec9 ("timers: Introduce the concept of timer slack
for legacy timers") this optimization is no longer applied for timers
where the expiration time got extended due to the slack value. So we
need to check again after the expiration time might have been updated.

[ tglx: Made it a single check by applying slack first and sorting
  out the slack = 0 value (all timeouts < 256 jiffies) early ]

Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Link: http://lkml.kernel.org/r/20110521105828.GA29442@Chamillionaire.breakpoint.cc
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-06-03 15:02:32 +02:00
Linus Torvalds
420c1c572d Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (62 commits)
  posix-clocks: Check write permissions in posix syscalls
  hrtimer: Remove empty hrtimer_init_hres_timer()
  hrtimer: Update hrtimer->state documentation
  hrtimer: Update base[CLOCK_BOOTTIME].offset correctly
  timers: Export CLOCK_BOOTTIME via the posix timers interface
  timers: Add CLOCK_BOOTTIME hrtimer base
  time: Extend get_xtime_and_monotonic_offset() to also return sleep
  time: Introduce get_monotonic_boottime and ktime_get_boottime
  hrtimers: extend hrtimer base code to handle more then 2 clockids
  ntp: Remove redundant and incorrect parameter check
  mn10300: Switch do_timer() to xtimer_update()
  posix clocks: Introduce dynamic clocks
  posix-timers: Cleanup namespace
  posix-timers: Add support for fd based clocks
  x86: Add clock_adjtime for x86
  posix-timers: Introduce a syscall for clock tuning.
  time: Splitout compat timex accessors
  ntp: Add ADJ_SETOFFSET mode bit
  time: Introduce timekeeping_inject_offset
  posix-timer: Update comment
  ...

Fix up new system-call-related conflicts in
	arch/x86/ia32/ia32entry.S
	arch/x86/include/asm/unistd_32.h
	arch/x86/include/asm/unistd_64.h
	arch/x86/kernel/syscall_table_32.S
(name_to_handle_at()/open_by_handle_at() vs clock_adjtime()), and some
due to movement of get_jiffies_64() in:
	kernel/time.c
2011-03-15 18:53:35 -07:00
Linus Torvalds
0586bed3e8 Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  rtmutex: tester: Remove the remaining BKL leftovers
  lockdep/timers: Explain in detail the locking problems del_timer_sync() may cause
  rtmutex: Simplify PI algorithm and make highest prio task get lock
  rwsem: Remove redundant asmregparm annotation
  rwsem: Move duplicate function prototypes to linux/rwsem.h
  rwsem: Unify the duplicate rwsem_is_locked() inlines
  rwsem: Move duplicate init macros and functions to linux/rwsem.h
  rwsem: Move duplicate struct rwsem declaration to linux/rwsem.h
  x86: Cleanup rwsem_count_t typedef
  rwsem: Cleanup includes
  locking: Remove deprecated lock initializers
  cred: Replace deprecated spinlock initialization
  kthread: Replace deprecated spinlock initialization
  xtensa: Replace deprecated spinlock initialization
  um: Replace deprecated spinlock initialization
  sparc: Replace deprecated spinlock initialization
  mips: Replace deprecated spinlock initialization
  cris: Replace deprecated spinlock initialization
  alpha: Replace deprecated spinlock initialization
  rtmutex-tester: Remove BKL tests
2011-03-15 18:28:30 -07:00
Stanislaw Gruszka
9977728840 debugobjects: Add hint for better object identification
In complex subsystems like mac80211 structures can contain several
timers and work structs, so identifying a specific instance from the
call trace and object type output of debugobjects can be hard.

Allow the subsystems which support debugobjects to provide a hint
function. This function returns a pointer to a kernel address
(preferrably the objects callback function) which is printed along
with the debugobjects type.

Add hint methods for timer_list, work_struct and hrtimer.

[ tglx: Massaged changelog, made it compile ]

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
LKML-Reference: <20110307085809.GA9334@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-03-08 16:10:38 +01:00
Steven Rostedt
48228f7b47 lockdep/timers: Explain in detail the locking problems del_timer_sync() may cause
Twice I had to explain the output about why lockdep gives an error with
locks in IRQ context and with del_timer_sync(). Might as well write it
up and place it in the comments above the code in del_timer_sync().
Perhaps the next time this lockdep dump triggers people will understand
the issues.

It is a ticky issue and very subtle, explaining it in detail in the code
may help others understand the issue when they stumble upon the bug
again.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1297186794.23343.19.camel@gandalf.stny.rr.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-16 13:35:08 +01:00
Peter Zijlstra
7ff207928e Revert "lockdep, timer: Fix del_timer_sync() annotation"
Both attempts at trying to allow softirq usage for
del_timer_sync() failed (produced bogus warnings),
so revert the commit for this release:

  f266a5110d: lockdep, timer: Fix del_timer_sync() annotation

and try again later.

Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Yong Zhang <yong.zhang0@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <1297174680.13327.107.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-08 16:18:39 +01:00
Peter Zijlstra
f266a5110d lockdep, timer: Fix del_timer_sync() annotation
Calling local_bh_enable() will want to actually start processing
softirqs, which isn't a good idea since this can get called with IRQs
disabled.

Cure this by using _local_bh_enable() which doesn't start processing
softirqs, and use raw_local_irq_save() to avoid any softirqs from
happening without letting lockdep think IRQs are in fact disabled.

Reported-by: Nick Bowler <nbowler@elliptictech.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reviewed-by: Yong Zhang <yong.zhang0@gmail.com>
LKML-Reference: <20110203141548.039540914@chello.nl>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-04 10:31:22 +01:00
Torben Hohn
871cf1e5f2 time: Move do_timer() to kernel/time/timekeeping.c
do_timer() is primary timekeeping related. calc_global_load() is
called from do_timer() as well, but that's more for historical
reasons.

[ tglx: Fixed up the calc_global_load() reject andmassaged changelog ]

Signed-off-by: Torben Hohn <torbenh@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: johnstul@us.ibm.com
Cc: yong.zhang0@gmail.com
Cc: hch@infradead.org
LKML-Reference: <20110127145855.23248.56933.stgit@localhost>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-01-31 14:55:41 +01:00
Linus Torvalds
dda5f0a372 Merge branch 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  MAINTAINERS: Update timer related entries
  timers: Use this_cpu_read
  timerqueue: Make timerqueue_getnext() static inline
  hrtimer: fix timerqueue conversion flub
  hrtimers: Convert hrtimers to use timerlist infrastructure
  timers: Fixup allmodconfig build issue
  timers: Rename timerlist infrastructure to timerqueue
  timers: Introduce timerlist infrastructure.
  hrtimer: Remove stale comment on curr_timer
  timer: Warn when del_timer_sync() is called in hardirq context
  timer: Del_timer_sync() can be used in softirq context
  timer: Make try_to_del_timer_sync() the same on SMP and UP
  posix-timers: Annotate lock_timer()
  timer: Permit statically-declared work with deferrable timers
  time: Use ARRAY_SIZE macro in timecompare.c
  timer: Initialize the field slack of timer_list
  timer_list: Remove alignment padding on 64 bit when CONFIG_TIMER_STATS
  time: Compensate for rounding on odd-frequency clocksources

Fix up trivial conflict in MAINTAINERS
2011-01-06 10:42:43 -08:00
Christoph Lameter
7496351ad8 timers: Use this_cpu_read
Eric asked for this.

[tglx: Because it generates faster code according to Erics ]

Signed-off-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: linux-mm@kvack.org
LKML-Reference: <alpine.DEB.2.00.1011301404490.4039@router.home>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-12-12 18:38:09 +01:00
Heiko Carstens
dbd87b5af0 nohz: Fix get_next_timer_interrupt() vs cpu hotplug
This fixes a bug as seen on 2.6.32 based kernels where timers got
enqueued on offline cpus.

If a cpu goes offline it might still have pending timers. These will
be migrated during CPU_DEAD handling after the cpu is offline.
However while the cpu is going offline it will schedule the idle task
which will then call tick_nohz_stop_sched_tick().

That function in turn will call get_next_timer_intterupt() to figure
out if the tick of the cpu can be stopped or not. If it turns out that
the next tick is just one jiffy off (delta_jiffies == 1)
tick_nohz_stop_sched_tick() incorrectly assumes that the tick should
not stop and takes an early exit and thus it won't update the load
balancer cpu.

Just afterwards the cpu will be killed and the load balancer cpu could
be the offline cpu.

On 2.6.32 based kernel get_nohz_load_balancer() gets called to decide
on which cpu a timer should be enqueued (see __mod_timer()). Which
leads to the possibility that timers get enqueued on an offline cpu.
These will never expire and can cause a system hang.

This has been observed 2.6.32 kernels. On current kernels
__mod_timer() uses get_nohz_timer_target() which doesn't have that
problem. However there might be other problems because of the too
early exit tick_nohz_stop_sched_tick() in case a cpu goes offline.

The easiest and probably safest fix seems to be to let
get_next_timer_interrupt() just lie and let it say there isn't any
pending timer if the current cpu is offline.

I also thought of moving migrate_[hr]timers() from CPU_DEAD to
CPU_DYING, but seeing that there already have been fixes at least in
the hrtimer code in this area I'm afraid that this could add new
subtle bugs.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20101201091109.GA8984@osiris.boeblingen.de.ibm.com>
Cc: stable@kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-12-08 20:15:07 +01:00
Peter Zijlstra
0f004f5a69 sched: Cure more NO_HZ load average woes
There's a long-running regression that proved difficult to fix and
which is hitting certain people and is rather annoying in its effects.

Damien reported that after 74f5187ac8 (sched: Cure load average vs
NO_HZ woes) his load average is unnaturally high, he also noted that
even with that patch reverted the load avgerage numbers are not
correct.

The problem is that the previous patch only solved half the NO_HZ
problem, it addressed the part of going into NO_HZ mode, not of
comming out of NO_HZ mode. This patch implements that missing half.

When comming out of NO_HZ mode there are two important things to take
care of:

 - Folding the pending idle delta into the global active count.
 - Correctly aging the averages for the idle-duration.

So with this patch the NO_HZ interaction should be complete and
behaviour between CONFIG_NO_HZ=[yn] should be equivalent.

Furthermore, this patch slightly changes the load average computation
by adding a rounding term to the fixed point multiplication.

Reported-by: Damien Wyart <damien.wyart@free.fr>
Reported-by: Tim McGrath <tmhikaru@gmail.com>
Tested-by: Damien Wyart <damien.wyart@free.fr>
Tested-by: Orion Poplawski <orion@cora.nwra.com>
Tested-by: Kyle McMartin <kyle@mcmartin.ca>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: stable@kernel.org
Cc: Chase Douglas <chase.douglas@canonical.com>
LKML-Reference: <1291129145.32004.874.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-12-08 20:15:04 +01:00
Yong Zhang
466bd30309 timer: Warn when del_timer_sync() is called in hardirq context
Add explict warning when del_timer_sync() is called in hardirq
context.

Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-10-22 14:46:25 +02:00
Yong Zhang
1118e2cd33 timer: Del_timer_sync() can be used in softirq context
Actually we have used del_timer_sync() in softirq context for a long time,
e.g. in __dst_free()::cancel_delayed_work().

So change the comments of it to warn on hardirq context only, and make
lockdep know about this change.

Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-10-22 14:46:25 +02:00
Yong Zhang
6f1bc451e6 timer: Make try_to_del_timer_sync() the same on SMP and UP
On UP try_to_del_timer_sync() is mapped to del_timer() which does not
take the running timer callback into account, so it has different
semantics.

Remove the SMP dependency of try_to_del_timer_sync() by using
base->running_timer in the UP case as well.

[ tglx: Removed set_running_timer() inline and tweaked the changelog ]

Signed-off-by: Yong Zhang <yong.zhang0@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-10-22 14:46:25 +02:00
Phil Carmody
dd6414b50f timer: Permit statically-declared work with deferrable timers
Currently, you have to just define a delayed_work uninitialised, and then
initialise it before first use.  That's a tad clumsy.  At risk of playing
mind-games with the compiler, fooling it into doing pointer arithmetic
with compile-time-constants, this lets clients properly initialise delayed
work with deferrable timers statically.

This patch was inspired by the issues which lead Artem Bityutskiy to
commit 8eab945c56 ("sunrpc: make the cache cleaner workqueue
deferrable").

Signed-off-by: Phil Carmody <ext-phil.2.carmody@nokia.com>
Acked-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Cc: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-10-21 17:30:06 +02:00
Peter Zijlstra
e360adbe29 irq_work: Add generic hardirq context callbacks
Provide a mechanism that allows running code in IRQ context. It is
most useful for NMI code that needs to interact with the rest of the
system -- like wakeup a task to drain buffers.

Perf currently has such a mechanism, so extract that and provide it as
a generic feature, independent of perf so that others may also
benefit.

The IRQ context callback is generated through self-IPIs where
possible, or on architectures like powerpc the decrementer (the
built-in timer facility) is set to generate an interrupt immediately.

Architectures that don't have anything like this get to do with a
callback from the timer tick. These architectures can call
irq_work_run() at the tail of any IRQ handlers that might enqueue such
work (like the perf IRQ handler) to avoid undue latencies in
processing the work.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
[ various fixes ]
Signed-off-by: Huang Ying <ying.huang@intel.com>
LKML-Reference: <1287036094.7768.291.camel@yhuang-dev>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-10-18 19:58:50 +02:00
Randy Dunlap
0caa621065 kernel/timer.c: fix kernel-doc function parameter warning
Fix kernel-doc warning, add @timer description:

  Warning(kernel/timer.c:335): No description found for parameter 'timer'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-10 15:33:09 -07:00
Linus Torvalds
af39008435 Merge branch 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  Documentation: Add timers/timers-howto.txt
  timer: Added usleep_range timer
  Revert "timer: Added usleep[_range] timer"
  clockevents: Remove the per cpu tick skew
  posix_timer: Move copy_to_user(created_timer_id) down in timer_create()
  timer: Added usleep[_range] timer
  timers: Document meaning of deferrable timer
2010-08-06 13:12:36 -07:00
Linus Torvalds
c4efd6b569 Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (27 commits)
  sched: Use correct macro to display sched_child_runs_first in /proc/sched_debug
  sched: No need for bootmem special cases
  sched: Revert nohz_ratelimit() for now
  sched: Reduce update_group_power() calls
  sched: Update rq->clock for nohz balanced cpus
  sched: Fix spelling of sibling
  sched, cpuset: Drop __cpuexit from cpu hotplug callbacks
  sched: Fix the racy usage of thread_group_cputimer() in fastpath_timer_check()
  sched: run_posix_cpu_timers: Don't check ->exit_state, use lock_task_sighand()
  sched: thread_group_cputime: Simplify, document the "alive" check
  sched: Remove the obsolete exit_state/signal hacks
  sched: task_tick_rt: Remove the obsolete ->signal != NULL check
  sched: __sched_setscheduler: Read the RLIMIT_RTPRIO value lockless
  sched: Fix comments to make them DocBook happy
  sched: Fix fix_small_capacity
  powerpc: Exclude arch_sd_sibiling_asym_packing() on UP
  powerpc: Enable asymmetric SMT scheduling on POWER7
  sched: Add asymmetric group packing option for sibling domain
  sched: Fix capacity calculations for SMT4
  sched: Change nohz idle load balancing logic to push model
  ...
2010-08-06 09:39:22 -07:00
Linus Torvalds
4aed2fd8e3 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (162 commits)
  tracing/kprobes: unregister_trace_probe needs to be called under mutex
  perf: expose event__process function
  perf events: Fix mmap offset determination
  perf, powerpc: fsl_emb: Restore setting perf_sample_data.period
  perf, powerpc: Convert the FSL driver to use local64_t
  perf tools: Don't keep unreferenced maps when unmaps are detected
  perf session: Invalidate last_match when removing threads from rb_tree
  perf session: Free the ref_reloc_sym memory at the right place
  x86,mmiotrace: Add support for tracing STOS instruction
  perf, sched migration: Librarize task states and event headers helpers
  perf, sched migration: Librarize the GUI class
  perf, sched migration: Make the GUI class client agnostic
  perf, sched migration: Make it vertically scrollable
  perf, sched migration: Parameterize cpu height and spacing
  perf, sched migration: Fix key bindings
  perf, sched migration: Ignore unhandled task states
  perf, sched migration: Handle ignored migrate out events
  perf: New migration tool overview
  tracing: Drop cpparg() macro
  perf: Use tracepoint_synchronize_unregister() to flush any pending tracepoint call
  ...

Fix up trivial conflicts in Makefile and drivers/cpufreq/cpufreq.c
2010-08-06 09:30:52 -07:00
Ingo Molnar
61be7fdec2 Merge branch 'perf/nmi' into perf/core
Conflicts:
	kernel/Makefile

Merge reason: Add the now complete topic, fix the conflict.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-08-05 08:45:05 +02:00
Patrick Pannuto
5e7f5a178b timer: Added usleep_range timer
usleep_range is a finer precision implementations of msleep
and is designed to be a drop-in replacement for udelay where
a precise sleep / busy-wait is unnecessary.

Since an easy interface to hrtimers could lead to an undesired
proliferation of interrupts, we provide only a "range" API,
forcing the caller to think about an acceptable tolerance on
both ends and hopefully avoiding introducing another interrupt.

INTRO

As discussed here ( http://lkml.org/lkml/2007/8/3/250 ), msleep(1) is not
precise enough for many drivers (yes, sleep precision is an unfair notion,
but consistently sleeping for ~an order of magnitude greater than requested
is worth fixing). This patch adds a usleep API so that udelay does not have
to be used. Obviously not every udelay can be replaced (those in atomic
contexts or being used for simple bitbanging come to mind), but there are
many, many examples of

mydriver_write(...)
/* Wait for hardware to latch */
udelay(100)

in various drivers where a busy-wait loop is neither beneficial nor
necessary, but msleep simply does not provide enough precision and people
are using a busy-wait loop instead.

CONCERNS FROM THE RFC

Why is udelay a problem / necessary? Most callers of udelay are in device/
driver initialization code, which is serial...

	As I see it, there is only benefit to sleeping over a delay; the
	notion of "refactoring" areas that use udelay was presented, but
	I see usleep as the refactoring. Consider i2c, if the bus is busy,
	you need to wait a bit (say 100us) before trying again, your
	current options are:

		* udelay(100)
		* msleep(1) <-- As noted above, actually as high as ~20ms
				on some platforms, so not really an option
		* Manually set up an hrtimer to try again in 100us (which
		  is what usleep does anyway...)

	People choose the udelay route because it is EASY; we need to
	provide a better easy route.

	Device / driver / boot code is *currently* serial, but every few
	months someone makes noise about parallelizing boot, and IMHO, a
	little forward-thinking now is one less thing to worry about
	if/when that ever happens

udelay's could be preempted

	Sure, but if udelay plans on looping 1000 times, and it gets
	preempted on loop 200, whenever it's scheduled again, it is
	going to do the next 800 loops.

Is the interruptible case needed?

	Probably not, but I see usleep as a very logical parallel to msleep,
	so it made sense to include the "full" API. Processors are getting
	faster (albeit not as quickly as they are becoming more parallel),
	so if someone wanted to be interruptible for a few usecs, why not
	let them? If this is a contentious point, I'm happy to remove it.

OTHER THOUGHTS

I believe there is also value in exposing the usleep_range option; it gives
the scheduler a lot more flexibility and allows the programmer to express
his intent much more clearly; it's something I would hope future driver
writers will take advantage of.

To get the results in the NUMBERS section below, I literally s/udelay/usleep
the kernel tree; I had to go in and undo the changes to the USB drivers, but
everything else booted successfully; I find that extremely telling in and
of itself -- many people are using a delay API where a sleep will suit them
just fine.

SOME ATTEMPTS AT NUMBERS

It turns out that calculating quantifiable benefit on this is challenging,
so instead I will simply present the current state of things, and I hope
this to be sufficient:

How many udelay calls are there in 2.6.35-rc5?

	udealy(ARG) >=	| COUNT
	1000		| 319
	500		| 414
	100		| 1146
	20		| 1832

I am working on Android, so that is my focus for this. The following table
is a modified usleep that simply printk's the amount of time requested to
sleep; these tests were run on a kernel with udelay >= 20 --> usleep

"boot" is power-on to lock screen
"power collapse" is when the power button is pushed and the device suspends
"resume" is when the power button is pushed and the lock screen is displayed
         (no touchscreen events or anything, just turning on the display)
"use device" is from the unlock swipe to clicking around a bit; there is no
	sd card in this phone, so fail loading music, video, camera

	ACTION		| TOTAL NUMBER OF USLEEP CALLS	| NET TIME (us)
	boot		| 22				| 1250
	power-collapse	| 9				| 1200
	resume		| 5				| 500
	use device	| 59				| 7700

The most interesting category to me is the "use device" field; 7700us of
busy-wait time that could be put towards better responsiveness, or at the
least less power usage.

Signed-off-by: Patrick Pannuto <ppannuto@codeaurora.org>
Cc: apw@canonical.com
Cc: corbet@lwn.net
Cc: arjan@linux.intel.com
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-08-04 11:00:45 +02:00
Thomas Gleixner
e1b004c3ef Revert "timer: Added usleep[_range] timer"
This reverts commit 22b8f15c2f to merge
an advanced version.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-08-04 10:53:00 +02:00
Jesse Barnes
8cadd2831b timer: add on-stack deferrable timer interfaces
In some cases (for instance with kernel threads) it may be desireable to
use on-stack deferrable timers to get their power saving benefits.  Add
interfaces to support this for the IPS driver.

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Matthew Garrett <mjg@redhat.com>
2010-08-03 09:48:45 -04:00