* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
jump label: Add work around to i386 gcc asm goto bug
x86, ftrace: Use safe noops, drop trap test
jump_label: Fix unaligned traps on sparc.
jump label: Make arch_jump_label_text_poke_early() optional
jump label: Fix error with preempt disable holding mutex
oprofile: Remove deprecated use of flush_scheduled_work()
oprofile: Fix the hang while taking the cpu offline
jump label: Fix deadlock b/w jump_label_mutex vs. text_mutex
jump label: Fix module __init section race
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: Check irq_remapped instead of remapping_enabled in destroy_irq()
flush_scheduled_work() is deprecated and scheduled to be removed.
sync_stop() currently cancels cpu_buffer works inside buffer_mutex and
flushes the system workqueue outside. Instead, split end_cpu_work()
into two parts - stopping further work enqueues and flushing works -
and do the former inside buffer_mutex and latter outside.
For stable kernels v2.6.35.y and v2.6.36.y.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@kernel.org
Signed-off-by: Robert Richter <robert.richter@amd.com>
The kernel build with CONFIG_OPROFILE and CPU_HOTPLUG enabled.
The oprofile is initialised using system timer in absence of hardware
counters supports. Oprofile isn't started from userland.
In this setup while doing a CPU offline the kernel hangs in infinite
for loop inside lock_hrtimer_base() function
This happens because as part of oprofile_cpu_notify(, it tries to
stop an hrtimer which was never started. These per-cpu hrtimers
are started when the oprfile is started.
echo 1 > /dev/oprofile/enable
This problem also existwhen the cpu is booted with maxcpus parameter
set. When bringing the remaining cpus online the timers are started
even if oprofile is not yet enabled.
This patch fix this issue by adding a state variable so that
these hrtimer start/stop is only attempted when oprofile is
started
For stable kernels v2.6.35.y and v2.6.36.y.
Reported-by: Jan Sebastien <s-jan@ti.com>
Tested-by: sricharan <r.sricharan@ti.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Cc: stable@kernel.org
Signed-off-by: Robert Richter <robert.richter@amd.com>
Instead of always assigning an increasing inode number in new_inode
move the call to assign it into those callers that actually need it.
For now callers that need it is estimated conservatively, that is
the call is added to all filesystems that do not assign an i_ino
by themselves. For a few more filesystems we can avoid assigning
any inode number given that they aren't user visible, and for others
it could be done lazily when an inode number is actually needed,
but that's left for later patches.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl:
vfs: make no_llseek the default
vfs: don't use BKL in default_llseek
llseek: automatically add .llseek fop
libfs: use generic_file_llseek for simple_attr
mac80211: disallow seeks in minstrel debug code
lirc: make chardev nonseekable
viotape: use noop_llseek
raw: use explicit llseek file operations
ibmasmfs: use generic_file_llseek
spufs: use llseek in all file operations
arm/omap: use generic_file_llseek in iommu_debug
lkdtm: use generic_file_llseek in debugfs
net/wireless: use generic_file_llseek in debugfs
drm: use noop_llseek
All file_operations should get a .llseek operation so we can make
nonseekable_open the default for future file operations without a
.llseek pointer.
The three cases that we can automatically detect are no_llseek, seq_lseek
and default_llseek. For cases where we can we can automatically prove that
the file offset is always ignored, we use noop_llseek, which maintains
the current behavior of not returning an error from a seek.
New drivers should normally not use noop_llseek but instead use no_llseek
and call nonseekable_open at open time. Existing drivers can be converted
to do the same when the maintainer knows for certain that no user code
relies on calling seek on the device file.
The generated code is often incorrectly indented and right now contains
comments that clarify for each added line why a specific variant was
chosen. In the version that gets submitted upstream, the comments will
be gone and I will manually fix the indentation, because there does not
seem to be a way to do that using coccinelle.
Some amount of new code is currently sitting in linux-next that should get
the same modifications, which I will do at the end of the merge window.
Many thanks to Julia Lawall for helping me learn to write a semantic
patch that does all this.
===== begin semantic patch =====
// This adds an llseek= method to all file operations,
// as a preparation for making no_llseek the default.
//
// The rules are
// - use no_llseek explicitly if we do nonseekable_open
// - use seq_lseek for sequential files
// - use default_llseek if we know we access f_pos
// - use noop_llseek if we know we don't access f_pos,
// but we still want to allow users to call lseek
//
@ open1 exists @
identifier nested_open;
@@
nested_open(...)
{
<+...
nonseekable_open(...)
...+>
}
@ open exists@
identifier open_f;
identifier i, f;
identifier open1.nested_open;
@@
int open_f(struct inode *i, struct file *f)
{
<+...
(
nonseekable_open(...)
|
nested_open(...)
)
...+>
}
@ read disable optional_qualifier exists @
identifier read_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
expression E;
identifier func;
@@
ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
{
<+...
(
*off = E
|
*off += E
|
func(..., off, ...)
|
E = *off
)
...+>
}
@ read_no_fpos disable optional_qualifier exists @
identifier read_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
@@
ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
{
... when != off
}
@ write @
identifier write_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
expression E;
identifier func;
@@
ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
{
<+...
(
*off = E
|
*off += E
|
func(..., off, ...)
|
E = *off
)
...+>
}
@ write_no_fpos @
identifier write_f;
identifier f, p, s, off;
type ssize_t, size_t, loff_t;
@@
ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
{
... when != off
}
@ fops0 @
identifier fops;
@@
struct file_operations fops = {
...
};
@ has_llseek depends on fops0 @
identifier fops0.fops;
identifier llseek_f;
@@
struct file_operations fops = {
...
.llseek = llseek_f,
...
};
@ has_read depends on fops0 @
identifier fops0.fops;
identifier read_f;
@@
struct file_operations fops = {
...
.read = read_f,
...
};
@ has_write depends on fops0 @
identifier fops0.fops;
identifier write_f;
@@
struct file_operations fops = {
...
.write = write_f,
...
};
@ has_open depends on fops0 @
identifier fops0.fops;
identifier open_f;
@@
struct file_operations fops = {
...
.open = open_f,
...
};
// use no_llseek if we call nonseekable_open
////////////////////////////////////////////
@ nonseekable1 depends on !has_llseek && has_open @
identifier fops0.fops;
identifier nso ~= "nonseekable_open";
@@
struct file_operations fops = {
... .open = nso, ...
+.llseek = no_llseek, /* nonseekable */
};
@ nonseekable2 depends on !has_llseek @
identifier fops0.fops;
identifier open.open_f;
@@
struct file_operations fops = {
... .open = open_f, ...
+.llseek = no_llseek, /* open uses nonseekable */
};
// use seq_lseek for sequential files
/////////////////////////////////////
@ seq depends on !has_llseek @
identifier fops0.fops;
identifier sr ~= "seq_read";
@@
struct file_operations fops = {
... .read = sr, ...
+.llseek = seq_lseek, /* we have seq_read */
};
// use default_llseek if there is a readdir
///////////////////////////////////////////
@ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier readdir_e;
@@
// any other fop is used that changes pos
struct file_operations fops = {
... .readdir = readdir_e, ...
+.llseek = default_llseek, /* readdir is present */
};
// use default_llseek if at least one of read/write touches f_pos
/////////////////////////////////////////////////////////////////
@ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read.read_f;
@@
// read fops use offset
struct file_operations fops = {
... .read = read_f, ...
+.llseek = default_llseek, /* read accesses f_pos */
};
@ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier write.write_f;
@@
// write fops use offset
struct file_operations fops = {
... .write = write_f, ...
+ .llseek = default_llseek, /* write accesses f_pos */
};
// Use noop_llseek if neither read nor write accesses f_pos
///////////////////////////////////////////////////////////
@ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read_no_fpos.read_f;
identifier write_no_fpos.write_f;
@@
// write fops use offset
struct file_operations fops = {
...
.write = write_f,
.read = read_f,
...
+.llseek = noop_llseek, /* read and write both use no f_pos */
};
@ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier write_no_fpos.write_f;
@@
struct file_operations fops = {
... .write = write_f, ...
+.llseek = noop_llseek, /* write uses no f_pos */
};
@ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
identifier read_no_fpos.read_f;
@@
struct file_operations fops = {
... .read = read_f, ...
+.llseek = noop_llseek, /* read uses no f_pos */
};
@ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
identifier fops0.fops;
@@
struct file_operations fops = {
...
+.llseek = noop_llseek, /* no read or write fn */
};
===== End semantic patch =====
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Julia Lawall <julia@diku.dk>
Cc: Christoph Hellwig <hch@infradead.org>
Commit e9677b3ce (oprofile, ARM: Use oprofile_arch_exit() to
cleanup on failure) caused oprofile_perf_exit to be called
in the cleanup path of oprofile_perf_init. The __exit tag
for oprofile_perf_exit should therefore be dropped.
The same has to be done for exit_driverfs as well, as this
function is called from oprofile_perf_exit. Else, we get
the following two linker errors.
LD .tmp_vmlinux1
`oprofile_perf_exit' referenced in section `.init.text' of arch/arm/oprofile/built-in.o: defined in discarded section `.exit.text' of arch/arm/oprofile/built-in.o
make: *** [.tmp_vmlinux1] Error 1
LD .tmp_vmlinux1
`exit_driverfs' referenced in section `.text' of arch/arm/oprofile/built-in.o: defined in discarded section `.exit.text' of arch/arm/oprofile/built-in.o
make: *** [.tmp_vmlinux1] Error 1
Signed-off-by: Anand Gadiyar <gadiyar@ti.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
oprofile_perf.c needs to include platform_device.h
Otherwise we get the following build break.
CC arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.o
arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:192: warning: 'struct platform_device' declared inside parameter list
arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:192: warning: its scope is only this definition or declaration, which is probably not what you want
arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:201: warning: 'struct platform_device' declared inside parameter list
arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:210: error: variable 'oprofile_driver' has initializer but incomplete type
arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:211: error: unknown field 'driver' specified in initializer
arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:211: error: extra brace group at end of initializer
arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:211: error: (near initialization for 'oprofile_driver')
arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:213: warning: excess elements in struct initializer
arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:213: warning: (near initialization for 'oprofile_driver')
arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:214: error: unknown field 'resume' specified in initializer
arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:214: warning: excess elements in struct initializer
arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:214: warning: (near initialization for 'oprofile_driver')
arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:215: error: unknown field 'suspend' specified in initializer
arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:215: warning: excess elements in struct initializer
arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c:215: warning: (near initialization for 'oprofile_driver')
arch/arm/oprofile/../../../drivers/oprofile/oprofile_perf.c: In function 'init_driverfs':
Signed-off-by: Anand Gadiyar <gadiyar@ti.com>
Cc: Matt Fleming <matt@console-pimps.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Oprofile counters are setup when profiling is disabled. Thus, writing
to oprofilefs has no immediate effect. Changes are updated only after
oprofile is reenabled.
To keep userland and kernel states synchronized, we now allow
configuration of oprofile only if profiling is disabled. In this case
it checks if the profiler is running and then disables write access to
oprofilefs by returning -EBUSY. The change should be backward
compatible with current oprofile userland daemon.
Acked-by: Maynard Johnson <maynardj@us.ibm.com>
Cc: William Cohen <wcohen@redhat.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
There is duplicate cleanup code in the init and exit functions. Now,
oprofile_arch_exit() is also used if oprofile_arch_init() fails.
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
This patch simplifies op_create_counter(). Removing if/else if paths
and return code variable by direct returning from function.
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Move the perf-events backend from arch/arm/oprofile into
drivers/oprofile so that the code can be shared between architectures.
This allows each architecture to maintain only a single copy of the PMU
accessor functions instead of one for both perf and OProfile. It also
becomes possible for other architectures to delete much of their
OProfile code in favour of the common code now available in
drivers/oprofile/oprofile_perf.c.
Signed-off-by: Matt Fleming <matt@console-pimps.org>
Tested-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Removing duplicate code by assigning the inodes private data pointer
in __oprofilefs_create_file(). Extending the function interface to
pass the pointer.
Signed-off-by: Robert Richter <robert.richter@amd.com>
oprofile_init calls oprofile_arch_init to initialise the architecture-specific
backend code. If this backend code returns failure, oprofile_arch_exit is
called immediately, making it difficult to allocate and free resources
correctly.
This patch removes the oprofile_arch_exit call from oprofile_init,
meaning that all architectures must ensure that oprofile_arch_init
cleans up any mess it's made before returning an error. As far as
I can tell, this only affects the code for ARM.
Cc: Robert Richter <robert.richter@amd.com>
Cc: Matt Fleming <matt@console-pimps.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
This patch fixes a crash during shutdown reported below. The crash is
caused by accessing already freed task structs. The fix changes the
order for registering and unregistering notifier callbacks.
All notifiers must be initialized before buffers start working. To
stop buffer synchronization we cancel all workqueues, unregister the
notifier callback and then flush all buffers. After all of this we
finally can free all tasks listed.
This should avoid accessing freed tasks.
On 22.07.10 01:14:40, Benjamin Herrenschmidt wrote:
> So the initial observation is a spinlock bad magic followed by a crash
> in the spinlock debug code:
>
> [ 1541.586531] BUG: spinlock bad magic on CPU#5, events/5/136
> [ 1541.597564] Unable to handle kernel paging request for data at address 0x6b6b6b6b6b6b6d03
>
> Backtrace looks like:
>
> spin_bug+0x74/0xd4
> ._raw_spin_lock+0x48/0x184
> ._spin_lock+0x10/0x24
> .get_task_mm+0x28/0x8c
> .sync_buffer+0x1b4/0x598
> .wq_sync_buffer+0xa0/0xdc
> .worker_thread+0x1d8/0x2a8
> .kthread+0xa8/0xb4
> .kernel_thread+0x54/0x70
>
> So we are accessing a freed task struct in the work queue when
> processing the samples.
Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: stable@kernel.org
Signed-off-by: Robert Richter <robert.richter@amd.com>