Commit 5f893b2639 "tracing: Move enabling tracepoints to just after
rcu_init()" broke the enabling of system call events from the command
line. The reason was that the enabling of command line trace events
was moved before PID 1 started, and the syscall tracepoints require
that all tasks have the TIF_SYSCALL_TRACEPOINT flag set. But the
swapper task (pid 0) is not part of that. Since the swapper task is the
only task that is running at this early in boot, no task gets the
flag set, and the tracepoint never gets reached.
Instead of setting the swapper task flag (there should be no reason to
do that), re-enabled trace events again after the init thread (PID 1)
has been started. It requires disabling all command line events and
re-enabling them, as just enabling them again will not reset the logic
to set the TIF_SYSCALL_TRACEPOINT flag, as the syscall tracepoint will
be fooled into thinking that it was already set, and wont try setting
it again. For this reason, we must first disable it and re-enable it.
Link: http://lkml.kernel.org/r/1421188517-18312-1-git-send-email-mpe@ellerman.id.au
Link: http://lkml.kernel.org/r/20150115040506.216066449@goodmis.org
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
If the function graph tracer traces a jprobe callback, the system will
crash. This can easily be demonstrated by compiling the jprobe
sample module that is in the kernel tree, loading it and running the
function graph tracer.
# modprobe jprobe_example.ko
# echo function_graph > /sys/kernel/debug/tracing/current_tracer
# ls
The first two commands end up in a nice crash after the first fork.
(do_fork has a jprobe attached to it, so "ls" just triggers that fork)
The problem is caused by the jprobe_return() that all jprobe callbacks
must end with. The way jprobes works is that the function a jprobe
is attached to has a breakpoint placed at the start of it (or it uses
ftrace if fentry is supported). The breakpoint handler (or ftrace callback)
will copy the stack frame and change the ip address to return to the
jprobe handler instead of the function. The jprobe handler must end
with jprobe_return() which swaps the stack and does an int3 (breakpoint).
This breakpoint handler will then put back the saved stack frame,
simulate the instruction at the beginning of the function it added
a breakpoint to, and then continue on.
For function tracing to work, it hijakes the return address from the
stack frame, and replaces it with a hook function that will trace
the end of the call. This hook function will restore the return
address of the function call.
If the function tracer traces the jprobe handler, the hook function
for that handler will not be called, and its saved return address
will be used for the next function. This will result in a kernel crash.
To solve this, pause function tracing before the jprobe handler is called
and unpause it before it returns back to the function it probed.
Some other updates:
Used a variable "saved_sp" to hold kcb->jprobe_saved_sp. This makes the
code look a bit cleaner and easier to understand (various tries to fix
this bug required this change).
Note, if fentry is being used, jprobes will change the ip address before
the function graph tracer runs and it will not be able to trace the
function that the jprobe is probing.
Link: http://lkml.kernel.org/r/20150114154329.552437962@goodmis.org
Cc: stable@vger.kernel.org # 2.6.30+
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Using just the filter for checking for trampolines or regs is not enough
when updating the code against the records that represent all functions.
Both the filter hash and the notrace hash need to be checked.
To trigger this bug (using trace-cmd and perf):
# perf probe -a do_fork
# trace-cmd start -B foo -e probe
# trace-cmd record -p function_graph -n do_fork sleep 1
The trace-cmd record at the end clears the filter before it disables
function_graph tracing and then that causes the accounting of the
ftrace function records to become incorrect and causes ftrace to bug.
Link: http://lkml.kernel.org/r/20150114154329.358378039@goodmis.org
Cc: stable@vger.kernel.org
[ still need to switch old_hash_ops to old_ops_hash ]
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
As the set_ftrace_filter affects both the function tracer as well as the
function graph tracer, the ops that represent each have a shared
ftrace_ops_hash structure. This allows both to be updated when the filter
files are updated.
But if function graph is enabled and the global_ops (function tracing) ops
is not, then it is possible that the filter could be changed without the
update happening for the function graph ops. This will cause the changes
to not take place and may even cause a ftrace_bug to occur as it could mess
with the trampoline accounting.
The solution is to check if the ops uses the shared global_ops filter and
if the ops itself is not enabled, to check if there's another ops that is
enabled and also shares the global_ops filter. In that case, the
modification still needs to be executed.
Link: http://lkml.kernel.org/r/20150114154329.055980438@goodmis.org
Cc: stable@vger.kernel.org # 3.17+
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Pull powerpc fixes from Michael Ellerman:
- Wire up sys_execveat(). Tested on 32 & 64 bit.
- Fix for kdump on LE systems with cpus hot unplugged.
- Revert Anton's fix for "kernel BUG at kernel/smpboot.c:134!", this
broke other platforms, we'll do a proper fix for 3.20.
* tag 'powerpc-3.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux:
Revert "powerpc: Secondary CPUs must set cpu_callin_map after setting active and online"
powerpc/kdump: Ignore failure in enabling big endian exception during crash
powerpc: Wire up sys_execveat() syscall
Pull ia64 fixlet from Tony Luck:
"Add execveat syscall"
* tag 'please-pull-syscall' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
[IA64] Enable execveat syscall for ia64
Pull UML fixes from Richard Weinberger:
"Two fixes for UML regressions. Nothing exciting"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
x86, um: actually mark system call tables readonly
um: Skip futex_atomic_cmpxchg_inatomic() test
Commit a074335a37 ("x86, um: Mark system call tables readonly") was
supposed to mark the sys_call_table in UML as RO by adding the const,
but it doesn't have the desired effect as it's nevertheless being placed
into the data section since __cacheline_aligned enforces sys_call_table
being placed into .data..cacheline_aligned instead. We need to use
the ____cacheline_aligned version instead to fix this issue.
Before:
$ nm -v arch/x86/um/sys_call_table_64.o | grep -1 "sys_call_table"
U sys_writev
0000000000000000 D sys_call_table
0000000000000000 D syscall_table_size
After:
$ nm -v arch/x86/um/sys_call_table_64.o | grep -1 "sys_call_table"
U sys_writev
0000000000000000 R sys_call_table
0000000000000000 D syscall_table_size
Fixes: a074335a37 ("x86, um: Mark system call tables readonly")
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
futex_atomic_cmpxchg_inatomic() does not work on UML because
it triggers a copy_from_user() in kernel context.
On UML copy_from_user() can only be used if the kernel was called
by a real user space process such that UML can use ptrace()
to fetch the value.
Reported-by: Miklos Szeredi <miklos@szeredi.hu>
Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
Tested-by: Daniel Walter <d.walter@0x90.at>
Pull SCSI fixes from James Bottomley:
"This is a set of three fixes: one to correct an abort path thinko
causing failures (and a panic) in USB on device misbehaviour, One to
fix an out of order issue in the fnic driver and one to match discard
expectations to qemu which otherwise cause Linux to behave badly as a
guest"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
SCSI: fix regression in scsi_send_eh_cmnd()
fnic: IOMMU Fault occurs when IO and abort IO is out of order
sd: tweak discard heuristics to work around QEMU SCSI issue
Pull sound fixes from Takashi Iwai:
"Nothing too exciting as a new year's start here: most of fixes are for
ASoC, a boot crash fix on OMAP for deferred probe, a few driver
specific fixes (Intel, dwc, rockchip, rt5677), in addition to typo
fixes in kerneldoc comments for PCM"
* tag 'sound-3.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: pcm: Fix kerneldoc for params_*() functions
ASoC: rockchip: i2s: fix maxburst of dma data to 4
ASoC: rockchip: i2s: fix error defination of transmit data level
ASoC: Intel: correct the fixed free block allocation
ASoC: rt5677: fixed rt5677_dsp_vad_put rt5677_dsp_vad_get panic
ASoC: Intel: Fix BYTCR machine driver MODULE_ALIAS
ASoC: Intel: Fix BYTCR firmware name
ASoC: dwc: Iterate over all channels
ASoC: dwc: Ensure FIFOs are flushed to prevent channel swap
ASoC: Intel: Add I2C dependency to two new machines
ASoC: dapm: Remove snd_soc_of_parse_audio_routing() due to deferred probe
Pull vhost cleanup and virtio bugfix
"There's a single change here, fixing a vhost bug where vhost
initialization fails due to used ring alignment check being too
strict"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
vhost: relax used address alignment
virtio_ring: document alignment requirements
Pull audit fix from Paul Moore:
"One audit patch to resolve a panic/oops when recording filenames in
the audit log, see the mail archive link below.
The fix isn't as nice as I would like, as it involves an allocate/copy
of the filename, but it solves the problem and the overhead should
only affect users who have configured audit rules involving file
names.
We'll revisit this issue with future kernels in an attempt to make
this suck less, but in the meantime I think this fix should go into
the next release of v3.19-rcX.
[ https://marc.info/?t=141986927600001&r=1&w=2 ]"
* 'upstream' of git://git.infradead.org/users/pcmoore/audit:
audit: create private file name copies when auditing inodes
Pull arch/nios2 fixes from Ley Foon Tan:
- fix compilation error when enable CONFIG_PREEMPT
- initialize cpuinfo.mmu variable supplied by the device tree
* tag 'nios2-fixes-v3.19-rc3' of git://git.rocketboards.org/linux-socfpga-next:
nios2: Use preempt_schedule_irq
nios2: Initialize cpuinfo.mmu
Pull crypto fix from Herbert Xu:
"Fix a use-after-free crash in the user-space crypto API"
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: af_alg - fix backlog handling
Follow aa0d532605 ("ia64: Use preempt_schedule_irq") and use
preempt_schedule_irq instead of enabling/disabling interrupts and
messing around with PREEMPT_ACTIVE in the nios2 low-level preemption
code ourselves. Also get rid of the now needless re-check for
TIF_NEED_RESCHED, preempt_schedule_irq will already take care of
rescheduling.
This also fixes the following build error when building with
CONFIG_PREEMPT:
arch/nios2/kernel/built-in.o: In function `need_resched':
arch/nios2/kernel/entry.S:374: undefined reference to `PREEMPT_ACTIVE'
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Acked-by: Ley Foon Tan <lftan@altera.com>
This patch initializes the mmu field of the cpuinfo structure to the
value supplied by the devicetree.
Signed-off-by: Walter Goossens <waltergoossens@home.nl>
Acked-by: Ley Foon Tan <lftan@altera.com>
Pull ARM SoC fixes from Arnd Bergmann:
"A very small set of fixes for 3.19, as everyone was out.
The clocksource patch was something I missed for the merge window
after the change that broke arm64 was merged through arm-soc. The
other two patches are a fix for an undetected merge problem in mvebu
and a defconfig change to make some exynos boards work with the normal
multi_v7_defconfig"
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
Add USB_EHCI_EXYNOS to multi_v7_defconfig
ARM: mvebu: Fix pinctrl configuration for Armada 370 DB
clocksource: arch_timer: Only use the virtual counter (CNTVCT) on arm64
Pull fbdev fixes from Tomi Valkeinen:
- Fix regression with Nokia N900 display
- Fix crash on fbdev using freed __initdata logos
- Fix fb_deferred_io_fsync() return value.
* tag 'fbdev-fixes-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux:
OMAPDSS: SDI: fix output port_num
video/fbdev: fix defio's fsync
video/logo: prevent use of logos after they have been freed
OMAPDSS: pll: NULL dereference in error handling
OMAPDSS: HDMI: remove double initializer entries
Pull input layer fixes from Dmitry Torokhov:
"Fixes for v7 protocol for ALPS devices and few other driver fixes.
Also users can request input events to be stamped with boot time
timestamps, in addition to real and monotonic timestamps"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: hil_kbd - fix incorrect use of init_completion
Input: alps - v7: document the v7 touchpad packet protocol
Input: alps - v7: fix finger counting for > 2 fingers on clickpads
Input: alps - v7: sometimes a single touch is reported in mt[1]
Input: alps - v7: ignore new packets
Input: evdev - add CLOCK_BOOTTIME support
Input: psmouse - expose drift duration for IBM trackpoints
Input: stmpe - bias keypad columns properly
Input: stmpe - enforce device tree only mode
mfd: stmpe: add pull up/down register offsets for STMPE
Input: optimize events_per_packet count calculation
Input: edt-ft5x06 - fixed a macro coding style issue
Input: gpio_keys - replace timer and workqueue with delayed workqueue
Input: gpio_keys - allow separating gpio and irq in device tree