There was a a bug in setup_new_exec(), whereby
the test to disabled perf monitoring was not
correct because the new credentials for the
process were not yet committed and therefore
the get_dumpable() test was never firing.
The patch fixes the problem by moving the
perf_event test until after the credentials
are committed.
Signed-off-by: Stephane Eranian <eranian@google.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
trinity fuzzer triggered WARN_ONCE("Can't find any breakpoint
slot") in arch_install_hw_breakpoint() but the problem is not
arch-specific.
The problem is, task_bp_pinned(cpu) checks "cpu == iter->cpu"
but this doesn't account the "all cpus" events with iter->cpu <
0.
This means that, say, register_user_hw_breakpoint(tsk) can
happily create the arbitrary number > HBP_NUM of breakpoints
which can not be activated. toggle_bp_task_slot() is equally
wrong by the same reason and nr_task_bp_pinned[] can have
negative entries.
Simple test:
# perl -e 'sleep 1 while 1' &
# perf record -e mem:0x10,mem:0x10,mem:0x10,mem:0x10,mem:0x10 -p `pidof perl`
Before this patch this triggers the same problem/WARN_ON(),
after the patch it correctly fails with -ENOSPC.
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20130620155006.GA6324@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patch fixes broken support of PEBS-LL on SNB-EP/IVB-EP.
For some reason, the LDLAT extra reg definition for snb_ep
showed up as duplicate in the snb table.
This patch moves the definition of LDLAT back into the
snb_ep table.
Thanks to Don Zickus for tracking this one down.
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20130607212210.GA11849@quad
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Vince's fuzzer once again found holes. This time it spotted a leak in
the locked page accounting.
When an event had redirected output and its close() was the last
reference to the buffer we didn't have a vm context to undo accounting.
Change the code to destroy the buffer on the last munmap() and detach
all redirected events at that time. This provides us the right context
to undo the vm accounting.
Reported-and-tested-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20130604084421.GI8923@twins.programming.kicks-ass.net
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Vince reported a problem found by his perf specific trinity
fuzzer.
Al noticed 2 problems with perf's mmap():
- it has issues against fork() since we use vma->vm_mm for accounting.
- it has an rb refcount leak on double mmap().
We fix the issues against fork() by using VM_DONTCOPY; I don't
think there's code out there that uses this; we didn't hear
about weird accounting problems/crashes. If we do need this to
work, the previously proposed VM_PINNED could make this work.
Aside from the rb reference leak spotted by Al, Vince's example
prog was indeed doing a double mmap() through the use of
perf_event_set_output().
This exposes another problem, since we now have 2 events with
one buffer, the accounting gets screwy because we account per
event. Fix this by making the buffer responsible for its own
accounting.
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Link: http://lkml.kernel.org/r/20130528085548.GA12193@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Fix to free gone and unused optprobes. This bug will
cause a kernel panic if the user reuses the killed and
unused probe.
Reported at:
http://sourceware.org/ml/systemtap/2013-q2/msg00142.html
In the normal path, an optprobe on an init function is
unregistered when a module goes live.
unregister_kprobe(kp)
-> __unregister_kprobe_top
->__disable_kprobe
->disarm_kprobe(ap == op)
->__disarm_kprobe
->unoptimize_kprobe : the op is queued
on unoptimizing_list
and do nothing in __unregister_kprobe_bottom
After a while (usually wait 5 jiffies), kprobe_optimizer
runs to unoptimize and free optprobe.
kprobe_optimizer
->do_unoptimize_kprobes
->arch_unoptimize_kprobes : moved to free_list
->do_free_cleaned_kprobes
->hlist_del: the op is removed
->free_aggr_kprobe
->arch_remove_optimized_kprobe
->arch_remove_kprobe
->kfree: the op is freed
Here, if kprobes_module_callback is called and the delayed
unoptimizing probe is picked BEFORE kprobe_optimizer runs,
kprobes_module_callback
->kill_kprobe
->kill_optimized_kprobe : dequeued from unoptimizing_list <=!!!
->arch_remove_optimized_kprobe
->arch_remove_kprobe
(but op is not freed, and on the kprobe hash table)
This doesn't happen if the probe unregistration is done AFTER
kprobes_module_callback is called (because at that time the op
is gone), and kprobe-tracer does it.
To fix this bug, this patch changes kprobes_module_callback to
enqueue the op to freeing_list at kill_optimized_kprobe only
if the op is unused. The unused probes on freeing_list will
be freed in do_free_cleaned_kprobes.
Note that this calls arch_remove_*kprobe twice on the
same probe. Thus those functions have to check the double free.
Fortunately, most of arch codes already checked that except
for mips. This will be fixed in the next patch.
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Timo Juhani Lindfors <timo.lindfors@iki.fi>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: systemtap@sourceware.org
Cc: yrl.pp-manager.tt@hitachi.com
Cc: David S. Miller <davem@davemloft.net>
Cc: "David S. Miller" <davem@davemloft.net>
Link: http://lkml.kernel.org/r/20130522093409.9084.63554.stgit@mhiramat-M0-7522
[ Minor edits. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
do_smart_update_queue() is called when an operation (semop,
semctl(SETVAL), semctl(SETALL), ...) modified the array. It must check
which of the sleeping tasks can proceed.
do_smart_update_queue() missed a few wakeups:
- if a sleeping complex op was completed, then all per-semaphore queues
must be scanned - not only those that were modified by *sops
- if a sleeping simple op proceeded, then the global queue must be
scanned again
And:
- the test for "|sops == NULL) before scanning the global queue is not
required: If the global queue is empty, then it doesn't need to be
scanned - regardless of the reason for calling do_smart_update_queue()
The patch is not optimized, i.e. even completing a wait-for-zero
operation causes a rescan. This is done to keep the patch as simple as
possible.
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Acked-by: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull NFS client bugfixes from Trond Myklebust:
- Stable fix to prevent an rpc_task wakeup race
- Fix a NFSv4.1 session drain deadlock
- Fix a NFSv4/v4.1 mount regression when not running rpc.gssd
- Ensure auth_gss pipe detection works in namespaces
- Fix SETCLIENTID fallback if rpcsec_gss is not available
* tag 'nfs-for-3.10-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFS: Fix SETCLIENTID fallback if GSS is not available
SUNRPC: Prevent an rpc_task wakeup race
NFSv4.1 Fix a pNFS session draining deadlock
SUNRPC: Convert auth_gss pipe detection to work in namespaces
SUNRPC: Faster detection if gssd is actually running
SUNRPC: Fix a bug in gss_create_upcall
Pull parisc fixes from Helge Deller:
"This time we made the kernel- and interruption stack allocation
reentrant which fixed some strange kernel crashes (specifically
protection ID traps).
Furthemore this patchset fixes the interrupt stack in UP and SMP
configurations by using native locking instructions. And finally
usage of floating point calculations on parisc were disabled in the
MPILIB."
* 'parisc-for-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: fix irq stack on UP and SMP
parisc/superio: Use module_pci_driver to register driver
parisc: make interrupt and interruption stack allocation reentrant
parisc: show number of FPE and unaligned access handler calls in /proc/interrupts
parisc: add additional parisc git tree to MAINTAINERS file
parisc: use PAGE_SHIFT instead of hardcoded value 12 in pacache.S
parisc: add rp5470 entry to machine database
MPILIB: disable usage of floating point registers on parisc
Pull xfs fixes from Ben Myers:
"Here are fixes for corruption on 512 byte filesystems, a rounding
error, a use-after-free, some flags to fix lockdep reports, and
several fixes related to CRCs. We have a somewhat larger post -rc1
queue than usual due to fixes related to the CRC feature we merged for
3.10:
- Fix for corruption with FSX on 512 byte blocksize filesystems
- Fix rounding error in xfs_free_file_space
- Fix use-after-free with extent free intents
- Add several missing KM_NOFS flags to fix lockdep reports
- Several fixes for CRC related code"
* tag 'for-linus-v3.10-rc3' of git://oss.sgi.com/xfs/xfs:
xfs: remote attribute lookups require the value length
xfs: xfs_attr_shortform_allfit() does not handle attr3 format.
xfs: xfs_da3_node_read_verify() doesn't handle XFS_ATTR3_LEAF_MAGIC
xfs: fix missing KM_NOFS tags to keep lockdep happy
xfs: Don't reference the EFI after it is freed
xfs: fix rounding in xfs_free_file_space
xfs: fix sub-page blocksize data integrity writes
Pull power management and ACPI fixes from Rafael Wysocki:
- Additional CPU ID for the intel_pstate driver from Dirk Brandewie.
- More cpufreq fixes related to ARM big.LITTLE support and locking from
Viresh Kumar.
- VIA C7 cpufreq build fix from Rafał Bilski.
- ACPI power management fix making it possible to use device power
states regardless of the CONFIG_PM setting from Rafael J Wysocki.
- New ACPI video blacklist item from Bastian Triller.
* tag 'pm+acpi-3.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / video: Add "Asus UL30A" to ACPI video detect blacklist
cpufreq: arm_big_little_dt: Instantiate as platform_driver
cpufreq: arm_big_little_dt: Register driver only if DT has valid data
cpufreq / e_powersaver: Fix linker error when ACPI processor is a module
cpufreq / intel_pstate: Add additional supported CPU ID
cpufreq: Drop rwsem lock around CPUFREQ_GOV_POLICY_EXIT
ACPI / PM: Allow device power states to be used for CONFIG_PM unset
Pull slave-dma fixes from Vinod Koul:
"We have two patches from Andy & Rafael fixing the Lynxpoint dma"
* 'fixes' of git://git.infradead.org/users/vkoul/slave-dma:
ACPI / LPSS: register clock device for Lynxpoint DMA properly
dma: acpi-dma: parse CSRT to extract additional resources
kcore_vmalloc is in fs/proc/kcore.c and kcore_mem is unused across
the tree. Noticed while grepping the tree for some other kcore stuff.
(score looks pretty unmaintained to me.)
Signed-off-by: Kyle McMartin <kyle@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull ARC fixes from Vineet Gupta:
- Fallouts/wreckage of Cache Flush optimizations / aliasing dcache
support
- Fix for an interesting bug where piped input to grep was getting
mysteriously clobbered
* tag 'arc-v3.10-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: lazy dcache flush broke gdb in non-aliasing configs
ARC: Use enough bits for determining page's cache color
ARC: Brown paper bag bug in macro for checking cache color
ARC: copy_(to|from)_user() to honor usermode-access permissions
ARC: [mm] Prevent stray dcache lines after__sync_icache_dcach()
ARC: [TB10x] Remove redundant abilis,simple-pinctrl mechanism
Pull ARM fixes from Russell King:
"Just three this time, all really quite small"
* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
ARM: 7729/1: vfp: ensure VFP_arch is non-zero when VFP is not supported
ARM: 7727/1: remove the .vm_mm value from gate_vma
ARM: 7723/1: crypto: sha1-armv4-large.S: fix SP handling
gdbserver inserting a breakpoint ends up calling copy_user_page() for a
code page. The generic version of which (non-aliasing config) didn't set
the PG_arch_1 bit hence update_mmu_cache() didn't sync dcache/icache for
corresponding dynamic loader code page - causing garbade to be executed.
So now aliasing versions of copy_user_highpage()/clear_page() are made
default. There is no significant overhead since all of special alias
handling code is compiled out for non-aliasing build
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Merge fixes from Andrew Morton:
"A bunch of fixes and one simple fbdev driver which missed the merge
window because people will still talking about it (to no great
effect)."
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (30 commits)
aio: fix kioctx not being freed after cancellation at exit time
mm/pagewalk.c: walk_page_range should avoid VM_PFNMAP areas
drivers/rtc/rtc-max8998.c: check for pdata presence before dereferencing
ocfs2: goto out_unlock if ocfs2_get_clusters_nocache() failed in ocfs2_fiemap()
random: fix accounting race condition with lockless irq entropy_count update
drivers/char/random.c: fix priming of last_data
mm/memory_hotplug.c: fix printk format warnings
nilfs2: fix issue of nilfs_set_page_dirty() for page at EOF boundary
drivers/block/brd.c: fix brd_lookup_page() race
fbdev: FB_GOLDFISH should depend on HAS_DMA
drivers/rtc/rtc-pl031.c: pass correct pointer to free_irq()
auditfilter.c: fix kernel-doc warnings
aio: fix io_getevents documentation
revert "selftest: add simple test for soft-dirty bit"
drivers/leds/leds-ot200.c: fix error caused by shifted mask
mm/THP: use pmd_populate() to update the pmd with pgtable_t pointer
linux/kernel.h: fix kernel-doc warning
mm compaction: fix of improper cache flush in migration code
rapidio/tsi721: fix bug in MSI interrupt handling
hfs: avoid crash in hfs_bnode_create
...
Pull ARM SoC fixes from Olof Johansson:
"We didn't have any fixes sent up for -rc2, so this is a slightly
larger batch. A bit all over the place platform-wise; OMAP, at91,
marvell, renesas, sunxi, ux500, etc.
I tried to summarize highlights but there isn't a whole lot to point
out. Lots of little things fixed all over. A couple of defconfig
updates due to new/changing options."
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (44 commits)
ARM: at91/sama5: fix incorrect PMC pcr div definition
ARM: at91/dt: fix macb pinctrl_macb_rmii_mii_alt definition
ARM: at91: at91sam9n12: move external irq declatation to DT
ARM: shmobile: marzen: Use error values in usb_power_*
ARM: tegra: defconfig fixes
ARM: nomadik: fix IRQ assignment for SMC ethernet
ARM: vt8500: Add missing NULL terminator in dt_compat
clk: tegra: add ac97 controller clock
clk: tegra: remove USB from clk init table
ARM: dts: mvebu: Fix wrong the address reg value for the L2-cache node
ARM: plat-orion: Fix num_resources and id for ge10 and ge11
ARM: OMAP2+: hwmod: Remove sysc slave idle and auto idle apis
SERIAL: OMAP: Remove the slave idle handling from the driver
ARM: OMAP2+: serial: Remove the un-used slave idle hooks
ARM: OMAP2+: hwmod-data: UART IP needs software control to manage sidle modes
ARM: OMAP2+: hwmod: Add a new flag to handle SIDLE in SWSUP only in active
ARM: OMAP2+: hwmod: Fix sidle programming in _enable_sysc()/_idle_sysc()
arm: mvebu: fix the 'ranges' property to handle PCIe
ARM: mvebu: select ARCH_REQUIRE_GPIOLIB for mvebu platform
ARM: AM33XX: Add missing .clkdm_name to clkdiv32k_ick clock
...
The recent changes overhauling fs/aio.c introduced a bug that results in
the kioctx not being freed when outstanding kiocbs are cancelled at
exit_aio() time. Specifically, a kiocb that is cancelled has its
completion events discarded by batch_complete_aio(), which then fails to
wake up the process stuck in free_ioctx(). Fix this by modifying the
wait_event() condition in free_ioctx() appropriately.
This patch was tested with the cancel operation in the thread based code
posted yesterday.
[akpm@linux-foundation.org: fix build]
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Kent Overstreet <koverstreet@google.com>
Cc: Josh Boyer <jwboyer@redhat.com>
Cc: Zach Brown <zab@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A panic can be caused by simply cat'ing /proc/<pid>/smaps while an
application has a VM_PFNMAP range. It happened in-house when a
benchmarker was trying to decipher the memory layout of his program.
/proc/<pid>/smaps and similar walks through a user page table should not
be looking at VM_PFNMAP areas.
Certain tests in walk_page_range() (specifically split_huge_page_pmd())
assume that all the mapped PFN's are backed with page structures. And
this is not usually true for VM_PFNMAP areas. This can result in panics
on kernel page faults when attempting to address those page structures.
There are a half dozen callers of walk_page_range() that walk through a
task's entire page table (as N. Horiguchi pointed out). So rather than
change all of them, this patch changes just walk_page_range() to ignore
VM_PFNMAP areas.
The logic of hugetlb_vma() is moved back into walk_page_range(), as we
want to test any vma in the range.
VM_PFNMAP areas are used by:
- graphics memory manager gpu/drm/drm_gem.c
- global reference unit sgi-gru/grufile.c
- sgi special memory char/mspec.c
- and probably several out-of-tree modules
[akpm@linux-foundation.org: remove now-unused hugetlb_vma() stub]
Signed-off-by: Cliff Wickman <cpw@sgi.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: David Sterba <dsterba@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>