* 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86-64, vdso: Do not allocate memory for the vDSO
clocksource: Change __ARCH_HAS_CLOCKSOURCE_DATA to a CONFIG option
x86, vdso: Drop now wrong comment
Document the vDSO and add a reference parser
ia64: Replace clocksource.fsys_mmio with generic arch data
x86-64: Move vread_tsc and vread_hpet into the vDSO
clocksource: Replace vread with generic arch data
x86-64: Add --no-undefined to vDSO build
x86-64: Allow alternative patching in the vDSO
x86: Make alternative instruction pointers relative
x86-64: Improve vsyscall emulation CS and RIP handling
x86-64: Emulate legacy vsyscalls
x86-64: Fill unused parts of the vsyscall page with 0xcc
x86-64: Remove vsyscall number 3 (venosys)
x86-64: Map the HPET NX
x86-64: Remove kernel.vsyscall64 sysctl
x86-64: Give vvars their own page
x86-64: Document some of entry_64.S
x86-64: Fix alignment of jiffies variable
When interrupts are delayed due to interrupt masking or due to other
interrupts being serviced the HPET periodic-emuation would fail. This
happened because given an interval t and a time for the current interrupt
m we would compute the next time as t + m. This works until we are
delayed for > t, in which case we would be writing a new value which is in
fact in the past.
This can be solved by computing the next time instead as (k * t) + m where
k is large enough to be in the future. The exact computation of k is
described in a comment to the code.
More detail:
Assuming an interval of 5 between each expected interrupt we have a normal
case of
t0: interrupt, read t0 from comparator, set next interrupt t0 + 5
t5: interrupt, read t5 from comparator, set next interrupt t5 + 5
t10: interrupt, read t10 from comparator, set next interrupt t10 + 5
...
So, what happens when the interrupt is serviced too late?
t0: interrupt, read t0 from comparator, set next interrupt t0 + 5
t11: delayed interrupt serviced, read t5 from comparator, set next
interrupt t5 + 5, which is in the past!
... counter loops ...
t10: Much much later, get the next interrupt.
This can happen either because we have interrupts masked for too long
(some stupid driver goes on a printk rampage) or just because we are
pushing the limits of the interval (too small a period), or both most
probably.
My solution is to read the main counter as well and set the next interrupt
to occur at the right interval, for example:
t0: interrupt, read t0 from comparator, set next interrupt t0 + 5
t11: delayed interrupt serviced, read t5 from comparator, set next
interrupt t15 as t10 has been missed.
t15: back on track.
Signed-off-by: Nils Carlson <nils.carlson@ericsson.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The big kernel lock has been removed from all these files at some point,
leaving only the #include.
Remove this too as a cleanup.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Structure info is copied to userland with some padding fields unitialized.
It leads to leaking of stack memory.
[akpm@linux-foundation.org: remove now-unneeded zeroing of info->hi_ireqfreq]
Signed-off-by: Vasiliy Kulikov <segooon@gmail.com>
Cc: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix the following style problems:
WARNING: Use #include <linux/uaccess.h> instead of <asm/uaccess.h>
WARNING: Use #include <linux/io.h> instead of <asm/io.h>
ERROR: code indent should use tabs where possible
ERROR: do not initialise statics to 0 or NULL
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Cc: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jaswinder Singh Rajput wrote:
> By executing Documentation/timers/hpet_example.c
>
> for polling, I requested for 3 iterations but it seems iteration work
> for only 2 as first expired time is always very small.
>
> # ./hpet_example poll /dev/hpet 10 3
> -hpet: executing poll
> hpet_poll: info.hi_flags 0x0
> hpet_poll: expired time = 0x13
> hpet_poll: revents = 0x1
> hpet_poll: data 0x1
> hpet_poll: expired time = 0x1868c
> hpet_poll: revents = 0x1
> hpet_poll: data 0x1
> hpet_poll: expired time = 0x18645
> hpet_poll: revents = 0x1
> hpet_poll: data 0x1
Clearing the HPET interrupt enable bit disables interrupt generation
but does not disable the timer, so the interrupt status bit will still
be set when the timer elapses. If another interrupt arrives before
the timer has been correctly programmed (due to some other device on
the same interrupt line, or CONFIG_DEBUG_SHIRQ), this results in an
extra unwanted interrupt event because the status bit is likely to be
set from comparator matches that happened before the device was opened.
Therefore, we have to ensure that the interrupt status bit is and
stays cleared until we actually program the timer.
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Reported-by: Jaswinder Singh Rajput <jaswinderlinux@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Bob Picco <bpicco@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When the initialization code in hpet finds a memory resource and does not
find an IRQ, it does not unmap the memory resource previously mapped.
There are buggy BIOSes which report resources exactly like this and what
is worse the memory region bases point to normal RAM. This normally would
not matter since the space is not touched. But when PAT is turned on,
ioremap causes the page to be uncached and sets this bit in page->flags.
Then when the page is about to be used by the allocator, it is reported
as:
BUG: Bad page state in process md5sum pfn:3ed00
page:ffffea0000dbd800 count:0 mapcount:0 mapping:(null) index:0x0
page flags: 0x20000001000000(uncached)
Pid: 7956, comm: md5sum Not tainted 2.6.34-12-desktop #1
Call Trace:
[<ffffffff810df851>] bad_page+0xb1/0x100
[<ffffffff810dfa45>] prep_new_page+0x1a5/0x1c0
[<ffffffff810dfe01>] get_page_from_freelist+0x3a1/0x640
[<ffffffff810e01af>] __alloc_pages_nodemask+0x10f/0x6b0
...
In this particular case:
1) HPET returns 3ed00000 as memory region base, but it is not in
reserved ranges reported by the BIOS (excerpt):
BIOS-e820: 0000000000100000 - 00000000af6cf000 (usable)
BIOS-e820: 00000000af6cf000 - 00000000afdcf000 (reserved)
2) there is no IRQ resource reported by HPET method. On the other
hand, the Intel HPET specs (1.0a) says (3.2.5.1):
_CRS (
// Report 1K of memory consumed by this Timer Block
memory range consumed
// Optional: only used if BIOS allocates Interrupts [1]
IRQs consumed
)
[1] For case where Timer Block is configured to consume IRQ0/IRQ8 AND
Legacy 8254/Legacy RTC hardware still exists, the device objects
associated with 8254 & RTC devices should not report IRQ0/IRQ8 as
"consumed resources".
So in theory we should check whether if it is the case and use those
interrupts instead.
Anyway the address reported by the BIOS here is bogus, so non-presence
of IRQ doesn't mean the "optional" part in point 2).
Since I got no reply previously, fix this by simply unmapping the space
when IRQ is not found and memory region was mapped previously. It would
be probably more safe to walk the resources again and unmap appropriately
depending on type. But as we now use only ioremap for both 2 memory
resource types, it is not necessarily needed right now.
Addresses https://bugzilla.novell.com/show_bug.cgi?id=629908
Reported-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Acked-by: Clemens Ladisch <clemens@ladisch.de>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
hpet uses the big kernel lock in its ioctl and open
functions. Replace this with a private mutex to be
sure. Since we're already touching the ioctl function,
add the compat_ioctl version as well -- all commands
except HPET_INFO are compatible and that one is easy
to add.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Clemens Ladisch <clemens@ladisch.de>
Cc: Bob Picco <bob.picco@hp.com>
These are the last remaining device drivers using
the ->ioctl file operation in the drivers directory
(except from v4l drivers).
[fweisbec: drop i8k pushdown as it has been done from
procfs pushdown branch already]
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
For consistency drop & in front of every proc_handler. Explicity
taking the address is unnecessary and it prevents optimizations
like stubbing the proc_handlers to NULL.
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
The periodic interrupt from drivers/char/hpet.c does not work correctly,
both when using the periodic capability of the hardware and while
emulating the periodic interrupt (when hardware does not support periodic
mode).
With timers capable of periodic interrupts, the comparator field is first
set with the period value followed by set of hidden accumulator, which has
the side effect of overwriting the comparator value. This results in
wrong periodicity for the interrupts. For, periodic interrupts to work,
following steps are necessary, in that order.
* Set config with Tn_VAL_SET_CNF bit
* Write to hidden accumulator, the value written is the time when the
first interrupt should be generated
* Write compartor with period interval for subsequent interrupts
(http://www.intel.com/hardwaredesign/hpetspec_1.pdf )
When emulating periodic timer with timers not capable of periodic
interrupt, driver is adding the period to counter value instead of
comparator value, which causes slow drift when using this emulation.
Also, driver seems to add hpetp->hp_delta both while setting up periodic
interrupt and while emulating periodic interrupts with timers not capable
of doing periodic interrupts. This hp_delta will result in slower than
expected interrupt rate and should not be used while setting the interval.
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Nils Carlson <nils.carlson@ericsson.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pass clocksource pointer to the read() callback for clocksources. This
allows us to share the callback between multiple instances.
[hugh@veritas.com: fix powerpc build of clocksource pass clocksource mods]
[akpm@linux-foundation.org: cleanup]
Signed-off-by: Magnus Damm <damm@igel.co.jp>
Acked-by: John Stultz <johnstul@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
hpet_calibrate() has a possibility of miss-calibration due to SMI. If SMI
interrupts in the while loop of calibration, then return value will be
big. This change calibrates until stabilizing by the return value with a
small value.
[akpm@linux-foundation.org: trivial style tweaks]
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Acked-by: Clemens Ladisch <clemens@ladisch.de>
Acked-by: Vojtech Pavlik <vojtech@suse.cz>
Cc: Robert Picco <Robert.Picco@hp.com>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Impact: update documentation / help text
Original link is dead.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
As it is, all instances of ->release() for files that have ->fasync()
need to remember to evict file from fasync lists; forgetting that
creates a hole and we actually have a bunch that *does* forget.
So let's keep our lives simple - let __fput() check FASYNC in
file->f_flags and call ->fasync() there if it's been set. And lose that
crap in ->release() instances - leaving it there is still valid, but we
don't have to bother anymore.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On x86 systems with CONFIG_HPET_TIMER enabled, when
the HPET driver (drivers/char/hpet.c) is loaded,
an incorrect busy message is printed when the driver
initializes since the HPET has already been allocated
by the core timer code. Remove the warning message.
Signed-off-by: David John <davidjon@xenontk.org>
Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
fix:
On Thursday 31 July 2008, Ingo Molnar wrote:
> drivers/built-in.o: In function `hpet_alloc':
> : undefined reference to `__udivdi3'
> drivers/built-in.o: In function `hpet_alloc':
> : undefined reference to `__umoddi3'
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Minor /dev/hpet updates and bugfixes:
* Remove dead code, mostly remnants of an incomplete/unusable
kernel interface ... noted when addressing "sparse" warnings:
+ hpet_unregister() and a routine it calls
+ hpet_task and all references, including hpet_task_lock
+ hpet_data.hd_flags (and HPET_DATA_PLATFORM)
* Correct and improve boot message:
+ displays *counter* (shared between comparators) bit width,
not *timer* bit widths (which are often mixed)
+ relabel "timers" as "comparators"; this is less confusing,
they are not independent like normal timers are (sigh)
+ display MHz not Hz; it's never less than 10 MHz.
* Tighten and correct the userspace interface code
+ don't accidentally program comparators in 64-bit mode using
32-bit values ... always force comparators into 32-bit mode
+ provide the correct bit definition flagging comparators with
periodic capability ... the ABI is unchanged
* Update Documentation/hpet.txt
+ be more correct and current
+ expand description a bit
+ don't mention that now-gone kernel interface
Plus, add a FIXME comment for something that could cause big trouble
on systems with more capable HPETs than at least Intel seems to ship.
It seems that few folk use this userspace interface; it's not very
usable given the general lack of HPET IRQ routing. I'm told that
the only real point of it any more is to mmap for fast timestamps;
IMO that's handled better through the gettimeofday() vsyscall.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Acked-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>