linux

mirror of https://github.com/armbian/linux.git synced 2026-01-06 10:13:00 -08:00

Author	SHA1	Message	Date
Olof Johansson	e306dfd06f	ARM64: unwind: Fix PC calculation The frame PC value in the unwind code used to just take the saved LR value and use that. That's incorrect as a stack trace, since it shows the return path stack, not the call path stack. In particular, it shows faulty information in case the bl is done as the very last instruction of one label, since the return point will be in the next label. That can easily be seen with tail calls to panic(), which is marked __noreturn and thus doesn't have anything useful after it. Easiest here is to just correct the unwind code and do a -4, to get the actual call site for the backtrace instead of the return site. Signed-off-by: Olof Johansson <olof@lixom.net> Cc: stable@vger.kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-02-17 09:16:33 +00:00
Will Deacon	8e86f0b409	arm64: atomics: fix use of acquire + release for full barrier semantics Linux requires a number of atomic operations to provide full barrier semantics, that is no memory accesses after the operation can be observed before any accesses up to and including the operation in program order. On arm64, these operations have been incorrectly implemented as follows: // A, B, C are independent memory locations <Access [A]> // atomic_op (B) 1: ldaxr x0, [B] // Exclusive load with acquire <op(B)> stlxr w1, x0, [B] // Exclusive store with release cbnz w1, 1b <Access [C]> The assumption here being that two half barriers are equivalent to a full barrier, so the only permitted ordering would be A -> B -> C (where B is the atomic operation involving both a load and a store). Unfortunately, this is not the case by the letter of the architecture and, in fact, the accesses to A and C are permitted to pass their nearest half barrier resulting in orderings such as Bl -> A -> C -> Bs or Bl -> C -> A -> Bs (where Bl is the load-acquire on B and Bs is the store-release on B). This is a clear violation of the full barrier requirement. The simple way to fix this is to implement the same algorithm as ARMv7 using explicit barriers: <Access [A]> // atomic_op (B) dmb ish // Full barrier 1: ldxr x0, [B] // Exclusive load <op(B)> stxr w1, x0, [B] // Exclusive store cbnz w1, 1b dmb ish // Full barrier <Access [C]> but this has the undesirable effect of introducing two full barrier instructions. A better approach is actually the following, non-intuitive sequence: <Access [A]> // atomic_op (B) 1: ldxr x0, [B] // Exclusive load <op(B)> stlxr w1, x0, [B] // Exclusive store with release cbnz w1, 1b dmb ish // Full barrier <Access [C]> The simple observations here are: - The dmb ensures that no subsequent accesses (e.g. the access to C) can enter or pass the atomic sequence. - The dmb also ensures that no prior accesses (e.g. the access to A) can pass the atomic sequence. - Therefore, no prior access can pass a subsequent access, or vice-versa (i.e. A is strictly ordered before C). - The stlxr ensures that no prior access can pass the store component of the atomic operation. The only tricky part remaining is the ordering between the ldxr and the access to A, since the absence of the first dmb means that we're now permitting re-ordering between the ldxr and any prior accesses. From an (arbitrary) observer's point of view, there are two scenarios: 1. We have observed the ldxr. This means that if we perform a store to [B], the ldxr will still return older data. If we can observe the ldxr, then we can potentially observe the permitted re-ordering with the access to A, which is clearly an issue when compared to the dmb variant of the code. Thankfully, the exclusive monitor will save us here since it will be cleared as a result of the store and the ldxr will retry. Notice that any use of a later memory observation to imply observation of the ldxr will also imply observation of the access to A, since the stlxr/dmb ensure strict ordering. 2. We have not observed the ldxr. This means we can perform a store and influence the later ldxr. However, that doesn't actually tell us anything about the access to [A], so we've not lost anything here either when compared to the dmb variant. This patch implements this solution for our barriered atomic operations, ensuring that we satisfy the full barrier requirements where they are needed. Cc: <stable@vger.kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-02-07 16:45:43 +00:00
Nathan Lynch	d4022a3352	arm64: vdso: update wtm fields for CLOCK_MONOTONIC_COARSE Update wall-to-monotonic fields in the VDSO data page unconditionally. These are used to service CLOCK_MONOTONIC_COARSE, which is not guarded by use_syscall. Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-02-05 11:55:49 +00:00
Nathan Lynch	069b918623	arm64: vdso: fix coarse clock handling When __kernel_clock_gettime is called with a CLOCK_MONOTONIC_COARSE or CLOCK_REALTIME_COARSE clock id, it returns incorrectly to whatever the caller has placed in x2 ("ret x2" to return from the fast path). Fix this by saving x30/LR to x2 only in code that will call __do_get_tspec, restoring x30 afterward, and using a plain "ret" to return from the routine. Also: while the resulting tv_nsec value for CLOCK_REALTIME and CLOCK_MONOTONIC must be computed using intermediate values that are left-shifted by cs_shift (x12, set by __do_get_tspec), the results for coarse clocks should be calculated using unshifted values (xtime_coarse_nsec is in units of actual nanoseconds). The current code shifts intermediate values by x12 unconditionally, but x12 is uninitialized when servicing a coarse clock. Fix this by setting x12 to 0 once we know we are dealing with a coarse clock id. Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-02-05 11:55:30 +00:00
Will Deacon	4050740348	arm64: vdso: prevent ld from aligning PT_LOAD segments to 64k Whilst the text segment for our VDSO is marked as PT_LOAD in the ELF headers, it is mapped by the kernel and not actually subject to demand-paging. ld doesn't realise this, and emits a p_align field of 64k (the maximum supported page size), which conflicts with the load address picked by the kernel on 4k systems, which will be 4k aligned. This causes GDB to fail with "Failed to read a valid object file image from memory" when attempting to load the VDSO. This patch passes the -n option to ld, which prevents it from aligning PT_LOAD segments to the maximum page size. Cc: <stable@vger.kernel.org> Reported-by: Kyle McMartin <kyle@redhat.com> Acked-by: Kyle McMartin <kyle@redhat.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-02-04 17:52:47 +00:00
Linus Torvalds	deb2a1d29b	Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pyll ARM64 patches from Catalin Marinas: - Build fix with DMA_CMA enabled - Introduction of PTE_WRITE to distinguish between writable but clean and truly read-only pages - FIQs enabling/disabling clean-up (they aren't used on arm64) - CPU resume fix for the per-cpu offset restoring - Code comment typos * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: mm: Introduce PTE_WRITE arm64: mm: Remove PTE_BIT_FUNC macro arm64: FIQs are unused arm64: mm: fix the function name in comment of cpu_do_switch_mm arm64: fix build error if DMA_CMA is enabled arm64: kernel: fix per-cpu offset restore on resume arm64: mm: fix the function name in comment of __flush_dcache_area arm64: mm: use ubfm for dcache_line_size	2014-01-31 14:25:52 -08:00
Nicolas Pitre	f864b61ee4	arm64: FIQs are unused So any FIQ handling is superfluous at the moment. The functions to disable/enable FIQs is kept around if ever someone needs them in the future, but existing calling sites including arch_cpu_idle_prepare() may go for now. Signed-off-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-01-30 13:51:43 +00:00
Lorenzo Pieralisi	fb4a96029c	arm64: kernel: fix per-cpu offset restore on resume The introduction of percpu offset optimisation through tpidr_el1 in: Commit id :7158627686f02319c50c8d9d78f75d4c8 "arm64: percpu: implement optimised pcpu access using tpidr_el1" requires cpu_{suspend/resume} to restore the tpidr_el1 register upon resume so that percpu variables can be addressed correctly when a CPU comes out of reset from warm-boot. This patch fixes cpu_{suspend}/{resume} tpidr_el1 restoration on resume, by calling the set_my_cpu_offset C API, as it is done on primary and secondary CPUs on cold boot, so that, even if the register used to store the percpu offset is changed, the save and restore of general purpose registers does not have to be updated. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-01-24 14:27:40 +00:00
Linus Torvalds	82b51734b4	Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull ARM64 updates from Catalin Marinas: - CPU suspend support on top of PSCI (firmware Power State Coordination Interface) - jump label support - CMA can now be enabled on arm64 - HWCAP bits for crypto and CRC32 extensions - optimised percpu using tpidr_el1 register - code cleanup * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (42 commits) arm64: fix typo in entry.S arm64: kernel: restore HW breakpoint registers in cpu_suspend jump_label: use defined macros instead of hard-coding for better readability arm64, jump label: optimize jump label implementation arm64, jump label: detect %c support for ARM64 arm64: introduce aarch64_insn_gen_{nop\|branch_imm}() helper functions arm64: move encode_insn_immediate() from module.c to insn.c arm64: introduce interfaces to hotpatch kernel and module code arm64: introduce basic aarch64 instruction decoding helpers arm64: dts: Reduce size of virtio block device for foundation model arm64: Remove unused __data_loc variable arm64: Enable CMA arm64: Warn on NULL device structure for dma APIs arm64: Add hwcaps for crypto and CRC32 extensions. arm64: drop redundant macros from read_cpuid() arm64: Remove outdated comment arm64: cmpxchg: update macros to prevent warnings arm64: support single-step and breakpoint handler hooks ARM64: fix framepointer check in unwind_frame ARM64: check stack pointer in get_wchan ...	2014-01-20 15:40:44 -08:00
Neil Zhang	883c057367	arm64: fix typo in entry.S Commit `64681787` (arm64: let the core code deal with preempt_count) changed the code, but left the comments unchanged, fix it. Signed-off-by: Neil Zhang <zhangwm@marvell.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-01-13 13:55:13 +00:00
Lorenzo Pieralisi	65c021bb49	arm64: kernel: restore HW breakpoint registers in cpu_suspend When a CPU resumes from low-power, it restores HW breakpoint and watchpoint slots through a CPU PM notifier. Since we want to enable debugging as early as possible in the resume path, the mdscr content is restored along the general purpose registers in the cpu_suspend API and debug exceptions are reenabled when cpu_suspend returns. Since the CPU PM notifier is run after a CPU has been resumed, we cannot expect HW breakpoint registers to contain sane values till the notifier is run, since the HW breakpoints registers content is unknown at reset; this means that the CPU might run with debug exceptions enabled, mdscr restored but HW breakpoint registers containing junk values that can trigger spurious debug exceptions. This patch fixes current HW breakpoints restore by moving the HW breakpoints registers restoration to the cpu_suspend API, before the debug exceptions are enabled. This way, as soon as the cpu_suspend function returns the kernel can resume debugging with sane values in HW breakpoint registers. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-01-10 17:51:35 +00:00
Jiang Liu	9732cafd9d	arm64, jump label: optimize jump label implementation Optimize jump label implementation for ARM64 by dynamically patching kernel text. Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Jiang Liu <liuj97@gmail.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-01-08 15:23:53 +00:00
Jiang Liu	5c5bf25d4f	arm64: introduce aarch64_insn_gen_{nop\|branch_imm}() helper functions Introduce aarch64_insn_gen_{nop\|branch_imm}() helper functions, which will be used to implement jump label on ARM64. Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Jiang Liu <liuj97@gmail.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-01-08 15:21:29 +00:00
Jiang Liu	c84fced8d9	arm64: move encode_insn_immediate() from module.c to insn.c Function encode_insn_immediate() will be used by other instruction manipulate related functions, so move it into insn.c and rename it as aarch64_insn_encode_immediate(). Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Jiang Liu <liuj97@gmail.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-01-08 15:21:29 +00:00
Jiang Liu	ae16480785	arm64: introduce interfaces to hotpatch kernel and module code Introduce three interfaces to patch kernel and module code: aarch64_insn_patch_text_nosync(): patch code without synchronization, it's caller's responsibility to synchronize all CPUs if needed. aarch64_insn_patch_text_sync(): patch code and always synchronize with stop_machine() aarch64_insn_patch_text(): patch code and synchronize with stop_machine() if needed Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Jiang Liu <liuj97@gmail.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-01-08 15:21:29 +00:00
Jiang Liu	b11a64a48c	arm64: introduce basic aarch64 instruction decoding helpers Introduce basic aarch64 instruction decoding helper aarch64_get_insn_class() and aarch64_insn_hotpatch_safe(). Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Jiang Liu <liuj97@gmail.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2014-01-08 15:21:28 +00:00
Geoff Levand	b22cf637bb	arm64: Remove unused __data_loc variable The __data_loc variable is an unused left over from the 32 bit arm implementation. Remove that variable and adjust the __mmap_switched startup routine accordingly. Signed-off-by: Geoff Levand <geoff@infradead.org> for Huawei, Linaro Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2013-12-20 12:04:48 +00:00
Catalin Marinas	0a5be743e8	Merge tag 'arm64-suspend' of git://linux-arm.org/linux-2.6-lp into upstream * tag 'arm64-suspend' of git://linux-arm.org/linux-2.6-lp: arm64: add CPU power management menu/entries arm64: kernel: add PM build infrastructure arm64: kernel: add CPU idle call arm64: enable generic clockevent broadcast arm64: kernel: implement HW breakpoints CPU PM notifier arm64: kernel: refactor code to install/uninstall breakpoints arm: kvm: implement CPU PM notifier arm64: kernel: implement fpsimd CPU PM notifier arm64: kernel: cpu_{suspend/resume} implementation arm64: kernel: suspend/resume registers save/restore arm64: kernel: build MPIDR_EL1 hash function data structure arm64: kernel: add MPIDR_EL1 accessors macros Conflicts: arch/arm64/Kconfig	2013-12-19 17:57:51 +00:00
Steve Capper	4bff28ccda	arm64: Add hwcaps for crypto and CRC32 extensions. Advertise the optional cryptographic and CRC32 instructions to user space where present. Several hwcap bits [3-7] are allocated. Signed-off-by: Steve Capper <steve.capper@linaro.org> [bit 2 is taken now so use bits 3-7 instead] Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2013-12-19 17:44:08 +00:00
Liviu Dudau	81cac69944	arm64: Remove outdated comment Code referenced in the comment has moved to arch/arm64/kernel/cputable.c Signed-off-by: Liviu Dudau <Liviu.Dudau@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2013-12-19 17:44:06 +00:00
Sandeepa Prabhu	ee6214cec7	arm64: support single-step and breakpoint handler hooks AArch64 Single Steping and Breakpoint debug exceptions will be used by multiple debug framworks like kprobes & kgdb. This patch implements the hooks for those frameworks to register their own handlers for handling breakpoint and single step events. Reworked the debug exception handler in entry.S: do_dbg to route software breakpoint (BRK64) exception to do_debug_exception() Signed-off-by: Sandeepa Prabhu <sandeepa.prabhu@linaro.org> Signed-off-by: Deepak Saxena <dsaxena@linaro.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2013-12-19 17:43:11 +00:00
Konstantin Khlebnikov	26920dd2da	ARM64: fix framepointer check in unwind_frame We need at least 24 bytes above frame pointer. Signed-off-by: Konstantin Khlebnikov <k.khlebnikov@samsung.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2013-12-19 17:43:10 +00:00
Konstantin Khlebnikov	408c3658b0	ARM64: check stack pointer in get_wchan get_wchan() is lockless. Task may wakeup at any time and change its own stack, thus each next stack frame may be overwritten and filled with random stuff. Signed-off-by: Konstantin Khlebnikov <k.khlebnikov@samsung.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2013-12-19 17:43:09 +00:00
Will Deacon	12a0ef7b0a	arm64: use generic strnlen_user and strncpy_from_user functions This patch implements the word-at-a-time interface for arm64 using the same algorithm as ARM. We use the fls64 macro, which expands to a clz instruction via a compiler builtin. Big-endian configurations make use of the implementation from asm-generic. With this implemented, we can replace our byte-at-a-time strnlen_user and strncpy_from_user functions with the optimised generic versions. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2013-12-19 17:43:06 +00:00
Will Deacon	7158627686	arm64: percpu: implement optimised pcpu access using tpidr_el1 This patch implements optimised percpu variable accesses using the el1 r/w thread register (tpidr_el1) along the same lines as arch/arm/. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2013-12-19 17:43:06 +00:00

1 2 3 4 5 ...

238 Commits