A number of situations currently require the heavyweight smp_mb(),
even though there is no need to order prior stores against later
loads. Many architectures have much cheaper ways to handle these
situations, but the Linux kernel currently has no portable way
to make use of them.
This commit therefore supplies smp_load_acquire() and
smp_store_release() to remedy this situation. The new
smp_load_acquire() primitive orders the specified load against
any subsequent reads or writes, while the new smp_store_release()
primitive orders the specifed store against any prior reads or
writes. These primitives allow array-based circular FIFOs to be
implemented without an smp_mb(), and also allow a theoretical
hole in rcu_assign_pointer() to be closed at no additional
expense on most architectures.
In addition, the RCU experience transitioning from explicit
smp_read_barrier_depends() and smp_wmb() to rcu_dereference()
and rcu_assign_pointer(), respectively resulted in substantial
improvements in readability. It therefore seems likely that
replacing other explicit barriers with smp_load_acquire() and
smp_store_release() will provide similar benefits. It appears
that roughly half of the explicit barriers in core kernel code
might be so replaced.
[Changelog by PaulMck]
(cherry picked from commit 47933ad41a)
Reviewed-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Victor Kaplansky <VICTORK@il.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Link: http://lkml.kernel.org/r/20131213150640.908486364@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Move fiq_debugger into drivers/staging/android/fiq_debugger/ to
allow for sharing between ARM and ARM64.
Change-Id: I6ca5e8b7e3d000f57da3234260261c5592cef2a8
Signed-off-by: Colin Cross <ccross@android.com>
If more than one ETM or PTM are present, configure all of them
and enable the formatter in the ETB. This allows tracing on dual
core systems (e.g. omap4).
Change-Id: I028657d5cf2bee1b23f193d4387b607953b35888
Signed-off-by: Arve Hjønnevåg <arve@android.com>
The old code enabled data tracing, but did not configure the
range. We now configure it to trace all data addresses by default,
and add a trace_data_range attribute to change the range or disable
data tracing.
Change-Id: I9d04e3e1ea0d0b4d4d5bcb93b1b042938ad738b2
Signed-off-by: Arve Hjønnevåg <arve@android.com>
This patch implements CONFIG_DEBUG_RODATA, allowing
the kernel text section to be marked read-only in
order to catch bugs that write over the kernel. This
requires mapping the kernel code, plus up to 4MB, using
pages instead of sections, which can increase TLB
pressure.
The kernel is normally mapped using 1MB section entries
in the first level page table, and the first level page
table is copied into every mm. This prevents marking
the kernel text read-only, because the 1MB section
entries are too large granularity to separate the init
section, which is reused as read-write memory after
init, and the kernel text section. Also, the top level
page table for every process would need to be updated,
which is not possible to do safely and efficiently on SMP.
To solve both problems, allow alloc_init_pte to overwrite
an existing section entry with a fully-populated second
level page table. When CONFIG_DEBUG_RODATA is set, all
the section entries that overlap the kernel text section
will be replaced with page mappings. The kernel always
uses a pair of 2MB-aligned 1MB sections, so up to 2MB
of memory before and after the kernel may end up page
mapped.
When the top level page tables are copied into each
process the second level page tables are not copied,
leaving a single second level page table that will
affect all processes on all cpus. To mark a page
read-only, the second level page table is located using
the pointer in the first level page table for the
current process, and the supervisor RO bit is flipped
atomically. Once all pages have been updated, all TLBs
are flushed to ensure the changes are visible on all
cpus.
If CONFIG_DEBUG_RODATA is not set, the kernel will be
mapped using the normal 1MB section entries.
Change-Id: I94fae337f882c2e123abaf8e1082c29cd5d483c6
Signed-off-by: Colin Cross <ccross@android.com>
Based on a rough patch by frank.rowand@am.sony.com
Since ARM doesn't have an NMI (fiq's are not always available),
send an IPI to all other CPUs (current cpu prints the stack directly)
to capture a backtrace.
Change-Id: I8b163c8cec05d521b433ae133795865e8a33d4e2
Signed-off-by: Dima Zavin <dima@android.com>
ARM errata 727915 for PL310 has been updated to include a new
workaround required for PL310 r2p0 for l2x0_flush_all, which also
affects l2x0_clean_all in my testing. For r2p0, clean or flush
each set/way individually. For r3p0 or greater, use the debug
register for cleaning and flushing.
Requires exporting the cache_id, sets and ways detected in the
init function for later use.
Change-Id: I215055cbe5dc7e4e8184fb2befc4aff672ef0a12
Signed-off-by: Colin Cross <ccross@android.com>
This commit fixes the regression on Armada 370 (the kernal hang during
boot) introduced by the commit: "ARM: 7691/1: mm: kill unused
TLB_CAN_READ_FROM_L1_CACHE and use ALT_SMP instead".
When coming out of either a Wait for Interrupt (WFI) or a Wait for
Event (WFE) IDLE states, a specific timing sensitivity exists between
the retiring WFI/WFE instructions and the newly issued subsequent
instructions. This sensitivity can result in a CPU hang scenario. The
workaround is to insert either a Data Synchronization Barrier (DSB) or
Data Memory Barrier (DMB) command immediately after the WFI/WFE
instruction.
This commit was based on the work of Lior Amsalem, but heavily
modified to apply the errata fix dynamically according to the
processor type thanks to the suggestions of Russell King and Nicolas
Pitre.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Willy Tarreau <w@1wt.eu>
Cc: <stable@vger.kernel.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The __cpu_logical_map array is statically initialized to 0, which is a valid
MPIDR value. To prevent issues with the current implementation, this patch
defines an MPIDR_INVALID value, and statically initializes the
__cpu_logical_map[] array to it. Entries in the arm_dt_init_cpu_maps()
tmp_map array used to stash DT reg properties while parsing DT are initialized
with the MPIDR_INVALID value as well for consistency.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>