kernel

mirror of https://github.com/ukui/kernel.git synced 2026-03-09 10:07:04 -07:00

Author	SHA1	Message	Date
Mark Rutland	0072dc1b53	arm64: avoid BUILD_BUG_ON() in alternative-macros Nathan reports that the build fails when using clang and LTO: \| In file included from kernel/bounds.c:10: \| In file included from ./include/linux/page-flags.h:10: \| In file included from ./include/linux/bug.h:5: \| In file included from ./arch/arm64/include/asm/bug.h:26: \| In file included from ./include/asm-generic/bug.h:5: \| In file included from ./include/linux/compiler.h:248: \| In file included from ./arch/arm64/include/asm/rwonce.h:11: \| ./arch/arm64/include/asm/alternative-macros.h:224:2: error: call to undeclared function 'BUILD_BUG_ON'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] \| BUILD_BUG_ON(feature >= ARM64_NCAPS); \| ^ \| ./arch/arm64/include/asm/alternative-macros.h:241:2: error: call to undeclared function 'BUILD_BUG_ON'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] \| BUILD_BUG_ON(feature >= ARM64_NCAPS); \| ^ \| 2 errors generated. ... the problem being that when LTO is enabled, <asm/rwonce.h> includes <asm/alternative-macros.h>, and causes a circular include dependency through <linux/bug.h>. This manifests as BUILD_BUG_ON() not being defined when used within <asm/alternative-macros.h>. This patch avoids the problem and simplifies the include dependencies by using compiletime_assert() instead of BUILD_BUG_ON(). 
Signed-off-by: Mark Rutland <mark.rutland@arm.com> Fixes: `21fb26bfb0` ("arm64: alternatives: add alternative_has_feature_*()") Reported-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Link: http://lore.kernel.org/r/YyigTrxhE3IRPzjs@dev-arch.thelio-3990X Cc: Ard Biesheuvel <ardb@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220920140044.1709073-1-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-21 12:41:50 +01:00
Marc Zyngier	bb0cca240a	Merge branch kvm-arm64/single-step-async-exception into kvmarm-master/next * kvm-arm64/single-step-async-exception: : . : Single-step fixes from Reiji Watanabe: : : "This series fixes two bugs of single-step execution enabled by : userspace, and add a test case for KVM_GUESTDBG_SINGLESTEP to : the debug-exception test to verify the single-step behavior." : . KVM: arm64: selftests: Add a test case for KVM_GUESTDBG_SINGLESTEP KVM: arm64: selftests: Refactor debug-exceptions to make it amenable to new test cases KVM: arm64: Clear PSTATE.SS when the Software Step state was Active-pending KVM: arm64: Preserve PSTATE.SS for the guest while single-step is enabled Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-09-19 10:59:29 +01:00
Reiji Watanabe	370531d1e9	KVM: arm64: Clear PSTATE.SS when the Software Step state was Active-pending While userspace enables single-step, if the Software Step state at the last guest exit was "Active-pending", clear PSTATE.SS on guest entry to restore the state. Currently, KVM sets PSTATE.SS to 1 on every guest entry while userspace enables single-step for the vCPU (with KVM_GUESTDBG_SINGLESTEP). It means KVM always makes the vCPU's Software Step state "Active-not-pending" on the guest entry, which lets the vCPU perform single-step (then Software Step exception is taken). This could cause an extra single-step (without returning to userspace) if the Software Step state at the last guest exit was "Active-pending" (i.e. the last exit was triggered by an asynchronous exception after the single-step is performed, but before the Software Step exception is taken. See "Figure D2-3 Software step state machine" and "D2.12.7 Behavior in the active-pending state" in ARM DDI 0487I.a for more info about this behavior). Fix this by clearing PSTATE.SS on guest entry if the Software Step state at the last exit was "Active-pending" so that KVM restores the state (and the exception is taken before further single-step is performed). Fixes: `337b99bf7e` ("KVM: arm64: guest debug, add support for single-step") Signed-off-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220917010600.532642-3-reijiw@google.com	2022-09-19 10:48:53 +01:00
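The guest-entry decision described in the commit above can be modelled in a few lines of user-space C. This is only a sketch under stated assumptions: the enum and function names are hypothetical and do not come from the KVM sources, and the model ignores everything except the Software Step state at the last exit.

```c
#include <assert.h>
#include <stdbool.h>

/* Software Step states named in the commit message (names are illustrative). */
enum step_state {
	STEP_INACTIVE,
	STEP_ACTIVE_NOT_PENDING,
	STEP_ACTIVE_PENDING,
};

/*
 * Hypothetical model: value PSTATE.SS should take on guest entry while
 * userspace has single-step enabled for the vCPU.
 */
static bool pstate_ss_on_entry(enum step_state state_at_last_exit)
{
	assert(state_at_last_exit <= STEP_ACTIVE_PENDING);

	/*
	 * The step was already performed but the Software Step exception
	 * has not been taken yet: keep SS clear so the exception is taken
	 * before any further instruction is stepped.
	 */
	if (state_at_last_exit == STEP_ACTIVE_PENDING)
		return false;

	/* Otherwise arm a fresh single step. */
	return true;
}

/* With software step enabled, SS=1 is active-not-pending, SS=0 is active-pending. */
static enum step_state state_after_entry(bool pstate_ss)
{
	return pstate_ss ? STEP_ACTIVE_NOT_PENDING : STEP_ACTIVE_PENDING;
}
```

In this model, entering with a last-exit state of STEP_ACTIVE_PENDING leaves the vCPU active-pending, which is the state the commit says must be restored.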
Reiji Watanabe	34fbdee086	KVM: arm64: Preserve PSTATE.SS for the guest while single-step is enabled Preserve the PSTATE.SS value for the guest while userspace enables single-step (i.e. while KVM manipulates the PSTATE.SS) for the vCPU. Currently, while userspace enables single-step for the vCPU (with KVM_GUESTDBG_SINGLESTEP), KVM sets PSTATE.SS to 1 on every guest entry, not saving its original value. When userspace disables single-step, KVM doesn't restore the original value for the subsequent guest entry (use the current value instead). Exception return instructions copy PSTATE.SS from SPSR_ELx.SS only in certain cases when single-step is enabled (and set it to 0 in other cases). So, the value matters only when the guest enables single-step (and when the guest's Software step state isn't affected by single-step enabled by userspace, practically), though. Fix this by preserving the original PSTATE.SS value while userspace enables single-step, and restoring the value once it is disabled. This fix modifies the behavior of GET_ONE_REG/SET_ONE_REG for the PSTATE.SS while single-step is enabled by userspace. Presently, GET_ONE_REG/SET_ONE_REG gets/sets the current PSTATE.SS value, which KVM will override on the next guest entry (i.e. the value userspace gets/sets is not used for the next guest entry). With this patch, GET_ONE_REG/SET_ONE_REG will get/set the guest's preserved value, which KVM will preserve and try to restore after single-step is disabled. Fixes: `337b99bf7e` ("KVM: arm64: guest debug, add support for single-step") Signed-off-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220917010600.532642-2-reijiw@google.com	2022-09-19 10:48:53 +01:00
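The save-and-restore behaviour described in the commit above might be sketched as follows. The struct and function names are hypothetical and chosen for illustration; they are not the KVM data structures, and the accessors only model which value GET_ONE_REG/SET_ONE_REG would see according to the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-vCPU model; field names are illustrative, not KVM's. */
struct vcpu_model {
	bool pstate_ss;        /* live PSTATE.SS used for guest entry */
	bool saved_guest_ss;   /* guest's own value, kept while userspace steps */
	bool user_single_step; /* userspace single-step currently enabled */
};

/* Userspace enables single-step: remember the guest's value, then force SS=1. */
static void enable_user_single_step(struct vcpu_model *v)
{
	assert(v != NULL);
	if (!v->user_single_step) {
		v->saved_guest_ss = v->pstate_ss;
		v->user_single_step = true;
	}
	v->pstate_ss = true;
}

/* Userspace disables single-step: put the guest's original value back. */
static void disable_user_single_step(struct vcpu_model *v)
{
	assert(v != NULL);
	if (v->user_single_step) {
		v->pstate_ss = v->saved_guest_ss;
		v->user_single_step = false;
	}
}

/* Register read: report the preserved guest value while stepping. */
static bool get_reg_ss(const struct vcpu_model *v)
{
	return v->user_single_step ? v->saved_guest_ss : v->pstate_ss;
}

/* Register write: update the preserved guest value while stepping. */
static void set_reg_ss(struct vcpu_model *v, bool ss)
{
	if (v->user_single_step)
		v->saved_guest_ss = ss;
	else
		v->pstate_ss = ss;
}
```

The point of the sketch is that the value userspace reads or writes while stepping is the preserved one, so it is the value that takes effect again once single-step is disabled.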
Mark Rutland	d926079f17	arm64: alternatives: add shared NOP callback For each instance of an alternative, the compiler outputs a distinct copy of the alternative instructions into a subsection. As the compiler doesn't have special knowledge of alternatives, it cannot coalesce these to save space. In a defconfig kernel built with GCC 12.1.0, there are approximately 10,000 instances of alternative_has_feature_likely(), where the replacement instruction is always a NOP. As NOPs are position-independent, we don't need a unique copy per alternative sequence. This patch adds a callback to patch an alternative sequence with NOPs, and make use of this in alternative_has_feature_likely(). So that this can be used for other sites in future, this is written to patch multiple instructions up to the original sequence length. For NVHE, an alias is added to image-vars.h. For modules, the callback is exported. Note that as modules are loaded within 2GiB of the kernel, an alt_instr entry in a module can always refer directly to the callback, and no special handling is necessary. When building with GCC 12.1.0, the vmlinux is ~158KiB smaller, though the resulting Image size is unchanged due to alignment constraints and padding: \| % ls -al vmlinux-* \| -rwxr-xr-x 1 mark mark 134644592 Sep 1 14:52 vmlinux-after \| -rwxr-xr-x 1 mark mark 134486232 Sep 1 14:50 vmlinux-before \| % ls -al Image-* \| -rw-r--r-- 1 mark mark 37108224 Sep 1 14:52 Image-after \| -rw-r--r-- 1 mark mark 37108224 Sep 1 14:50 Image-before Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220912162210.3626215-9-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-16 17:15:03 +01:00
Mark Rutland	21fb26bfb0	arm64: alternatives: add alternative_has_feature_*() Currently we use a mixture of alternative sequences and static branches to handle features detected at boot time. For ease of maintenance we generally prefer to use static branches in C code, but this has a few downsides: * Each static branch has metadata in the __jump_table section, which is not discarded after features are finalized. This wastes some space, and slows down the patching of other static branches. * The static branches are patched at a different point in time from the alternatives, so changes are not atomic. This leaves a transient period where there could be a mismatch between the behaviour of alternatives and static branches, which could be problematic for some features (e.g. pseudo-NMI). * More (instrumentable) kernel code is executed to patch each static branch, which can be risky when patching certain features (e.g. irqflags management for pseudo-NMI). * When CONFIG_JUMP_LABEL=n, static branches are turned into a load of a flag and a conditional branch. This means it isn't safe to use such static branches in an alternative address space (e.g. the NVHE/PKVM hyp code), where the generated address isn't safe to access. To deal with these issues, this patch introduces new alternative_has_feature_*() helpers, which work like static branches but are patched using alternatives. This ensures the patching is performed at the same time as other alternative patching, allows the metadata to be freed after patching, and is safe for use in alternative address spaces. Note that all supported toolchains have asm goto support, and since commit: `a0a12c3ed0` ("asm goto: eradicate CC_HAS_ASM_GOTO") ... the CC_HAS_ASM_GOTO Kconfig symbol has been removed, so no feature check is necessary, and we can always make use of asm goto. Additionally, note that: * This has no impact on cpus_have_cap(), which is a dynamic check. * This has no functional impact on cpus_have_const_cap(). 
The branches are patched slightly later than before this patch, but these branches are not reachable until caps have been finalised. * It is now invalid to use cpus_have_final_cap() in the window between feature detection and patching. All existing uses are only expected after patching anyway, so this should not be a problem. * The LSE atomics will now be enabled during alternatives patching rather than immediately before. As the LL/SC and LSE atomics are functionally equivalent this should not be problematic. When building defconfig with GCC 12.1.0, the resulting Image is 64KiB smaller: \| % ls -al Image-* \| -rw-r--r-- 1 mark mark 37108224 Aug 23 09:56 Image-after \| -rw-r--r-- 1 mark mark 37173760 Aug 23 09:54 Image-before According to bloat-o-meter.pl: \| add/remove: 44/34 grow/shrink: 602/1294 up/down: 39692/-61108 (-21416) \| Function old new delta \| [...] \| Total: Before=16618336, After=16596920, chg -0.13% \| add/remove: 0/2 grow/shrink: 0/0 up/down: 0/-1296 (-1296) \| Data old new delta \| arm64_const_caps_ready 16 - -16 \| cpu_hwcap_keys 1280 - -1280 \| Total: Before=8987120, After=8985824, chg -0.01% \| add/remove: 0/0 grow/shrink: 0/0 up/down: 0/0 (0) \| RO Data old new delta \| Total: Before=18408, After=18408, chg +0.00% Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220912162210.3626215-8-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-16 17:15:03 +01:00
Mark Rutland	4c0bd995d7	arm64: alternatives: have callbacks take a cap Today, callback alternatives are special-cased within __apply_alternatives(), and are applied alongside patching for system capabilities as ARM64_NCAPS is not part of the boot_capabilities feature mask. This special-casing is less than ideal. Giving special meaning to ARM64_NCAPS for this requires some structures and loops to use ARM64_NCAPS + 1 (AKA ARM64_NPATCHABLE), while others use ARM64_NCAPS. It's also not immediately clear that callback alternatives are only applied when applying alternatives for system-wide features. To make this a bit clearer, this patch changes the way that callback alternatives are identified to remove the special-casing of ARM64_NCAPS, and to allow callback alternatives to be associated with a cpucap as with all other alternatives. New cpucaps, ARM64_ALWAYS_BOOT and ARM64_ALWAYS_SYSTEM, are added which are always detected alongside boot cpu capabilities and system capabilities respectively. All existing callback alternatives are made to use ARM64_ALWAYS_SYSTEM, and so will be patched at the same point during the boot flow as before. Subsequent patches will make more use of these new cpucaps. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220912162210.3626215-7-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-16 17:15:03 +01:00
Mark Rutland	92b4b5619f	arm64: cpufeature: make cpus_have_cap() noinstr-safe Currently it isn't safe to use cpus_have_cap() from noinstr code as test_bit() is explicitly instrumented, and were cpus_have_cap() placed out-of-line, cpus_have_cap() itself could be instrumented. Make cpus_have_cap() noinstr safe by marking it __always_inline and using arch_test_bit(). Aside from the prevention of instrumentation, there should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220912162210.3626215-2-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-16 17:15:02 +01:00
James Morse	445c953e4a	arm64: cpufeature: Expose get_arm64_ftr_reg() outside cpufeature.c get_arm64_ftr_reg() returns the properties of a system register based on its instruction encoding. This is needed by erratum workaround in cpu_errata.c to modify the user-space visible view of id registers. Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/20220909165938.3931307-3-james.morse@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-16 15:16:34 +01:00
Mark Brown	10453bf149	arm64/sysreg: Convert ID_AA64AFRn_EL1 to automatic generation Convert ID_AA64AFRn_EL1 to automatic generation as per DDI0487I.a, no functional changes. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220910163354.860255-7-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-16 12:38:58 +01:00
Mark Brown	c65c617806	arm64/sysreg: Convert ID_AA64DFR1_EL1 to automatic generation Convert ID_AA64DFR1_EL1 to automatic generation as per DDI0487I.a, no functional changes. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220910163354.860255-6-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-16 12:38:58 +01:00
Mark Brown	e62a2d2610	arm64/sysreg: Convert ID_AA64DFR0_EL1 to automatic generation Convert ID_AA64DFR0_EL1 to automatic generation as per DDI0487I.a, no functional changes. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220910163354.860255-5-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-16 12:38:58 +01:00
Mark Brown	121a8fc088	arm64/sysreg: Use feature numbering for PMU and SPE revisions Currently the kernel refers to the versions of the PMU and SPE features by the version of the architecture where those features were updated, but the ARM refers to them using the FEAT_ names for the features. To improve consistency and help with updating for newer features, and since v9 will make our current naming scheme a bit more confusing, update the macros identifying features to use the FEAT_ based scheme. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220910163354.860255-4-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-16 12:38:57 +01:00
Mark Brown	fcf37b38ff	arm64/sysreg: Add _EL1 into ID_AA64DFR0_EL1 definition names Normally we include the full register name in the defines for fields within registers but this has not been followed for ID registers. In preparation for automatic generation of defines add the _EL1s into the defines for ID_AA64DFR0_EL1 to follow the convention. No functional changes. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220910163354.860255-3-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-16 12:38:57 +01:00
Mark Brown	c0357a73fa	arm64/sysreg: Align field names in ID_AA64DFR0_EL1 with architecture The naming scheme the architecture uses for the fields in ID_AA64DFR0_EL1 does not align well with kernel conventions, using as it does a lot of MixedCase in various arrangements. In preparation for automatically generating the defines for this register rename the defines used to match what is in the architecture. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220910163354.860255-2-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-16 12:38:57 +01:00
Mark Rutland	830a2a4d85	arm64: rework BTI exception handling If a BTI exception is taken from EL1, the entry code will treat this as an unhandled exception and will panic() the kernel. This is inconsistent with the way we handle FPAC exceptions, which have a dedicated handler and only necessarily kill the thread from which the exception was taken from, and we don't log all the information that could be relevant to debug the issue. The code in do_bti() has: BUG_ON(!user_mode(regs)); ... and it seems like the intent was to call this for EL1 BTI exceptions, as with FPAC, but this was omitted due to an oversight. This patch adds separate EL0 and EL1 BTI exception handlers, with the latter calling die() directly to report the original context the BTI exception was taken from. This matches our handling of FPAC exceptions. Prior to this patch, a BTI failure is reported as: \| Unhandled 64-bit el1h sync exception on CPU0, ESR 0x0000000034000002 -- BTI \| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc3-00131-g7d937ff0221d-dirty #9 \| Hardware name: linux,dummy-virt (DT) \| pstate: 20400809 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=-c) \| pc : test_bti_callee+0x4/0x10 \| lr : test_bti_caller+0x1c/0x28 \| sp : ffff80000800bdf0 \| x29: ffff80000800bdf0 x28: 0000000000000000 x27: 0000000000000000 \| x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000 \| x23: ffff80000a2b8000 x22: 0000000000000000 x21: 0000000000000000 \| x20: ffff8000099fa5b0 x19: ffff800009ff7000 x18: fffffbfffda37000 \| x17: 3120676e696d7573 x16: 7361202c6e6f6974 x15: 0000000041a90000 \| x14: 0040000000000041 x13: 0040000000000001 x12: ffff000001a90000 \| x11: fffffbfffda37480 x10: 0068000000000703 x9 : 0001000040000000 \| x8 : 0000000000090000 x7 : 0068000000000f03 x6 : 0060000000000f83 \| x5 : ffff80000a2b6000 x4 : ffff0000028d0000 x3 : ffff800009f78378 \| x2 : 0000000000000000 x1 : 0000000040210000 x0 : ffff8000080257e4 \| Kernel panic - not syncing: Unhandled exception \| CPU: 0 PID: 1 
Comm: swapper/0 Not tainted 5.19.0-rc3-00131-g7d937ff0221d-dirty #9 \| Hardware name: linux,dummy-virt (DT) \| Call trace: \| dump_backtrace.part.0+0xcc/0xe0 \| show_stack+0x18/0x5c \| dump_stack_lvl+0x64/0x80 \| dump_stack+0x18/0x34 \| panic+0x170/0x360 \| arm64_exit_nmi.isra.0+0x0/0x80 \| el1h_64_sync_handler+0x64/0xd0 \| el1h_64_sync+0x64/0x68 \| test_bti_callee+0x4/0x10 \| smp_cpus_done+0xb0/0xbc \| smp_init+0x7c/0x8c \| kernel_init_freeable+0x128/0x28c \| kernel_init+0x28/0x13c \| ret_from_fork+0x10/0x20 With this patch applied, a BTI failure is reported as: \| Internal error: Oops - BTI: 0000000034000002 [#1] PREEMPT SMP \| Modules linked in: \| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc3-00132-g0ad98265d582-dirty #8 \| Hardware name: linux,dummy-virt (DT) \| pstate: 20400809 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=-c) \| pc : test_bti_callee+0x4/0x10 \| lr : test_bti_caller+0x1c/0x28 \| sp : ffff80000800bdf0 \| x29: ffff80000800bdf0 x28: 0000000000000000 x27: 0000000000000000 \| x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000 \| x23: ffff80000a2b8000 x22: 0000000000000000 x21: 0000000000000000 \| x20: ffff8000099fa5b0 x19: ffff800009ff7000 x18: fffffbfffda37000 \| x17: 3120676e696d7573 x16: 7361202c6e6f6974 x15: 0000000041a90000 \| x14: 0040000000000041 x13: 0040000000000001 x12: ffff000001a90000 \| x11: fffffbfffda37480 x10: 0068000000000703 x9 : 0001000040000000 \| x8 : 0000000000090000 x7 : 0068000000000f03 x6 : 0060000000000f83 \| x5 : ffff80000a2b6000 x4 : ffff0000028d0000 x3 : ffff800009f78378 \| x2 : 0000000000000000 x1 : 0000000040210000 x0 : ffff800008025804 \| Call trace: \| test_bti_callee+0x4/0x10 \| smp_cpus_done+0xb0/0xbc \| smp_init+0x7c/0x8c \| kernel_init_freeable+0x128/0x28c \| kernel_init+0x28/0x13c \| ret_from_fork+0x10/0x20 \| Code: d50323bf d53cd040 d65f03c0 d503233f (d50323bf) Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Reviewed-by: Anshuman Khandual 
<anshuman.khandual@arm.com> Cc: Alexandru Elisei <alexandru.elisei@arm.com> Cc: Amit Daniel Kachhap <amit.kachhap@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220913101732.3925290-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-16 12:17:03 +01:00
Mark Rutland	a1fafa3b24	arm64: rework FPAC exception handling If an FPAC exception is taken from EL1, the entry code will call do_ptrauth_fault(), where due to: BUG_ON(!user_mode(regs)) ... the kernel will report a problem within do_ptrauth_fault() rather than reporting the original context the FPAC exception was taken from. The pt_regs and ESR value reported will be from within do_ptrauth_fault() and the code dump will be for the BRK in BUG_ON(), which isn't sufficient to debug the cause of the original exception. This patch makes the reporting better by having separate EL0 and EL1 FPAC exception handlers, with the latter calling die() directly to report the original context the FPAC exception was taken from. Note that we only need to prevent kprobes of the EL1 FPAC handler, since the EL0 FPAC handler cannot be called recursively. For consistency with do_el0_svc*(), I've named the split functions do_el{0,1}_fpac() rather than do_el{0,1}_ptrauth_fault(). I've also clarified the comment to not imply there are causes other than FPAC exceptions. Prior to this patch FPAC exceptions are reported as: \| kernel BUG at arch/arm64/kernel/traps.c:517! 
\| Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP \| Modules linked in: \| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc3-00130-g9c8a180a1cdf-dirty #12 \| Hardware name: FVP Base RevC (DT) \| pstate: 00400009 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) \| pc : do_ptrauth_fault+0x3c/0x40 \| lr : el1_fpac+0x34/0x54 \| sp : ffff80000a3bbc80 \| x29: ffff80000a3bbc80 x28: ffff0008001d8000 x27: 0000000000000000 \| x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000 \| x23: 0000000020400009 x22: ffff800008f70fa4 x21: ffff80000a3bbe00 \| x20: 0000000072000000 x19: ffff80000a3bbcb0 x18: fffffbfffda37000 \| x17: 3120676e696d7573 x16: 7361202c6e6f6974 x15: 0000000081a90000 \| x14: 0040000000000041 x13: 0040000000000001 x12: ffff000001a90000 \| x11: fffffbfffda37480 x10: 0068000000000703 x9 : 0001000080000000 \| x8 : 0000000000090000 x7 : 0068000000000f03 x6 : 0060000000000783 \| x5 : ffff80000a3bbcb0 x4 : ffff0008001d8000 x3 : 0000000072000000 \| x2 : 0000000000000000 x1 : 0000000020400009 x0 : ffff80000a3bbcb0 \| Call trace: \| do_ptrauth_fault+0x3c/0x40 \| el1h_64_sync_handler+0xc4/0xd0 \| el1h_64_sync+0x64/0x68 \| test_pac+0x8/0x10 \| smp_init+0x7c/0x8c \| kernel_init_freeable+0x128/0x28c \| kernel_init+0x28/0x13c \| ret_from_fork+0x10/0x20 \| Code: 97fffe5e a8c17bfd d50323bf d65f03c0 (d4210000) With this patch applied FPAC exceptions are reported as: \| Internal error: Oops - FPAC: 0000000072000000 [#1] PREEMPT SMP \| Modules linked in: \| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.19.0-rc3-00132-g78846e1c4757-dirty #11 \| Hardware name: FVP Base RevC (DT) \| pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) \| pc : test_pac+0x8/0x10 \| lr : 0x0 \| sp : ffff80000a3bbe00 \| x29: ffff80000a3bbe00 x28: 0000000000000000 x27: 0000000000000000 \| x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000 \| x23: ffff80000a2c8000 x22: 0000000000000000 x21: 0000000000000000 \| x20: ffff8000099fa5b0 x19: ffff80000a007000 
x18: fffffbfffda37000 \| x17: 3120676e696d7573 x16: 7361202c6e6f6974 x15: 0000000081a90000 \| x14: 0040000000000041 x13: 0040000000000001 x12: ffff000001a90000 \| x11: fffffbfffda37480 x10: 0068000000000703 x9 : 0001000080000000 \| x8 : 0000000000090000 x7 : 0068000000000f03 x6 : 0060000000000783 \| x5 : ffff80000a2c6000 x4 : ffff0008001d8000 x3 : ffff800009f88378 \| x2 : 0000000000000000 x1 : 0000000080210000 x0 : ffff000001a90000 \| Call trace: \| test_pac+0x8/0x10 \| smp_init+0x7c/0x8c \| kernel_init_freeable+0x128/0x28c \| kernel_init+0x28/0x13c \| ret_from_fork+0x10/0x20 \| Code: d50323bf d65f03c0 d503233f aa1f03fe (d50323bf) Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Alexandru Elisei <alexandru.elisei@arm.com> Cc: Amit Daniel Kachhap <amit.kachhap@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220913101732.3925290-5-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-16 12:17:03 +01:00
Mark Rutland	0f2cb928a1	arm64: consistently pass ESR_ELx to die() Currently, bug_handler() and kasan_handler() call die() with '0' as the 'err' value, whereas die_kernel_fault() passes the ESR_ELx value. For consistency, this patch ensures we always pass the ESR_ELx value to die(). As this is only called for exceptions taken from kernel mode, there should be no user-visible change as a result of this patch. For UNDEFINED exceptions, I've had to modify do_undefinstr() and its callers to pass the ESR_ELx value. In all cases the ESR_ELx value had already been read and was available. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Mark Brown <broonie@kernel.org> Cc: Alexandru Elisei <alexandru.elisei@arm.com> Cc: Amit Daniel Kachhap <amit.kachhap@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220913101732.3925290-4-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-16 12:17:03 +01:00
Mark Rutland	18906ff9af	arm64: die(): pass 'err' as long Recently, we reworked a lot of code to consistently pass ESR_ELx as a 64-bit quantity. However, we missed that this can be passed into die() and __die() as the 'err' parameter where it is truncated to a 32-bit int. As notify_die() already takes 'err' as a long, this patch changes die() and __die() to also take 'err' as a long, ensuring that the full value of ESR_ELx is retained. At the same time, die() is updated to consistently log 'err' as a zero-padded 64-bit quantity. Subsequent patches will pass the ESR_ELx value to die() for a number of exceptions. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Alexandru Elisei <alexandru.elisei@arm.com> Cc: Amit Daniel Kachhap <amit.kachhap@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220913101732.3925290-3-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-16 12:17:03 +01:00
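The truncation described in the commit above, and the zero-padded 64-bit logging it introduces, can be illustrated in user-space C. This is a sketch only: the helper names are hypothetical, the format string is an assumption inferred from the 16-digit values shown in the log excerpts in this listing, and the input with a bit set above bit 31 is a made-up value for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Models the old behaviour: passing a 64-bit ESR_ELx value through a
 * 32-bit parameter keeps only the low 32 bits.
 */
static uint64_t err_truncated_to_32(uint64_t esr)
{
	return (uint32_t)esr;
}

/*
 * Formats 'err' as a zero-padded 64-bit hex quantity (16 digits).
 * Returns the number of characters written, excluding the terminator.
 */
static int format_err(char *buf, size_t len, uint64_t err)
{
	assert(buf != NULL && len >= 17);
	return snprintf(buf, len, "%016llx", (unsigned long long)err);
}
```

For example, a hypothetical value of 0x0000000134000002 would lose its upper word when truncated, whereas a 64-bit parameter keeps it intact.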
Kefeng Wang	2be9880dc8	kernel: exit: cleanup release_thread() Only x86 has its own release_thread(); introduce a new weak release_thread() function to clean up the empty definitions in other architectures. Link: https://lkml.kernel.org/r/20220819014406.32266-1-wangkefeng.wang@huawei.com Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Acked-by: Guo Ren <guoren@kernel.org> [csky] Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Brian Cain <bcain@quicinc.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Acked-by: Stafford Horne <shorne@gmail.com> [openrisc] Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64] Acked-by: Huacai Chen <chenhuacai@kernel.org> [LoongArch] Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Chris Zankel <chris@zankel.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "David S. 
Miller" <davem@davemloft.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Guo Ren <guoren@kernel.org> [csky] Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Jonas Bonn <jonas@southpole.se> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Henderson <richard.henderson@linaro.org> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Xuerui Wang <kernel@xen0n.name> Cc: Yoshinori Sato <ysato@users.osdn.me> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-09-11 21:55:07 -07:00
Mark Rutland	78f6f5c994	arm64: atomic: always inline the assembly The __lse_*() and __ll_sc_*() atomic implementations are marked as inline rather than __always_inline, permitting a compiler to generate out-of-line versions, which may be instrumented. We marked the atomic wrappers as __always_inline in commit: `c35a824c31` ("arm64: make atomic helpers __always_inline") ... but did not think to do the same for the underlying implementations. If the compiler were to out-of-line an LSE or LL/SC atomic, this could break noinstr code. Ensure this doesn't happen by marking the underlying implementations as __always_inline. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220817155914.3975112-3-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-09 13:58:33 +01:00
Mark Rutland	b2c3ccbd00	arm64: atomics: remove LL/SC trampolines When CONFIG_ARM64_LSE_ATOMICS=y, each use of an LL/SC atomic results in a fragment of code being generated in a subsection without a clear association with its caller. A trampoline in the caller branches to the LL/SC atomic with a direct branch, and the atomic directly branches back into its trampoline. This breaks backtracing, as any PC within the out-of-line fragment will be symbolized as an offset from the nearest prior symbol (which may not be the function using the atomic), and since the atomic returns with a direct branch, the caller's PC may be missing from the backtrace. For example, with secondary_start_kernel() hacked to contain atomic_inc(NULL), the resulting exception can be reported as being taken from cpus_are_stuck_in_kernel(): \| Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 \| Mem abort info: \| ESR = 0x0000000096000004 \| EC = 0x25: DABT (current EL), IL = 32 bits \| SET = 0, FnV = 0 \| EA = 0, S1PTW = 0 \| FSC = 0x04: level 0 translation fault \| Data abort info: \| ISV = 0, ISS = 0x00000004 \| CM = 0, WnR = 0 \| [0000000000000000] user address but active_mm is swapper \| Internal error: Oops: 96000004 [#1] PREEMPT SMP \| Modules linked in: \| CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.19.0-11219-geb555cb5b794-dirty #3 \| Hardware name: linux,dummy-virt (DT) \| pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) \| pc : cpus_are_stuck_in_kernel+0xa4/0x120 \| lr : secondary_start_kernel+0x164/0x170 \| sp : ffff80000a4cbe90 \| x29: ffff80000a4cbe90 x28: 0000000000000000 x27: 0000000000000000 \| x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000 \| x23: 0000000000000000 x22: 0000000000000000 x21: 0000000000000000 \| x20: 0000000000000001 x19: 0000000000000001 x18: 0000000000000008 \| x17: 3030383832343030 x16: 3030303030307830 x15: ffff80000a4cbab0 \| x14: 0000000000000001 x13: 5d31666130663133 x12: 
3478305b20313030 \| x11: 3030303030303078 x10: 3020726f73736563 x9 : 726f737365636f72 \| x8 : ffff800009ff2ef0 x7 : 0000000000000003 x6 : 0000000000000000 \| x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000100 \| x2 : 0000000000000000 x1 : ffff0000029bd880 x0 : 0000000000000000 \| Call trace: \| cpus_are_stuck_in_kernel+0xa4/0x120 \| __secondary_switched+0xb0/0xb4 \| Code: 35ffffa3 17fffc6c d53cd040 f9800011 (885f7c01) \| ---[ end trace 0000000000000000 ]--- This is confusing and hinders debugging, and will be problematic for CONFIG_LIVEPATCH as these cases cannot be unwound reliably. This is very similar to recent issues with out-of-line exception fixups, which were removed in commits: `35d67794b8` ("arm64: lib: __arch_clear_user(): fold fixups into body") `4012e0e227` ("arm64: lib: __arch_copy_from_user(): fold fixups into body") `139f9ab73d` ("arm64: lib: __arch_copy_to_user(): fold fixups into body") When the trampolines were introduced in commit: `addfc38672` ("arm64: atomics: avoid out-of-line ll/sc atomics") The rationale was to improve icache performance by grouping the LL/SC atomics together. This has never been measured, and this theoretical benefit is outweighed by other factors: * As the subsections are collapsed into sections at object file granularity, these are spread out throughout the kernel and can share cachelines with unrelated code regardless. * GCC 12.1.0 has been observed to place the trampoline out-of-line in specialised __ll_sc_() functions, introducing more branching than was intended. Removing the trampolines has been observed to shrink a defconfig kernel Image by 64KiB when building with GCC 12.1.0. This patch removes the LL/SC trampolines, meaning that the LL/SC atomics will be inlined into their callers (or placed in out-of-line functions using regular BL/RET pairs). 
When CONFIG_ARM64_LSE_ATOMICS=y, the LL/SC atomics are always called in an unlikely branch, and will be placed in a cold portion of the function, so this should have minimal impact to the hot paths. Other than the improved backtracing, there should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220817155914.3975112-2-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-09 13:58:28 +01:00
Mark Rutland	4b5e694e25	arm64: stacktrace: track hyp stacks in unwinder's address space Currently unwind_next_frame_record() has an optional callback to convert the address space of the FP. This is necessary for the NVHE unwinder, which tracks the stacks in the hyp VA space, but accesses the frame records in the kernel VA space. This is a bit unfortunate since it clutters unwind_next_frame_record(), which will get in the way of future rework. Instead, this patch changes the NVHE unwinder to track the stacks in the kernel's VA space and translate to FP prior to calling unwind_next_frame_record(). This removes the need for the translate_fp() callback, as all unwinders consistently track stacks in the native address space of the unwinder. At the same time, this patch consolidates the generation of the stack addresses behind the stackinfo_get_*() helpers. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Kalesh Singh <kaleshsingh@google.com> Reviewed-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com> Reviewed-by: Mark Brown <broonie@kernel.org> Cc: Fuad Tabba <tabba@google.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220901130646.1316937-10-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-09 12:30:08 +01:00
Mark Rutland	8df137300d	arm64: stacktrace: track all stack boundaries explicitly Currently we call an on_accessible_stack() callback for each step of the unwinder, requiring redundant work to be performed in the core of the unwind loop (e.g. disabling preemption around accesses to per-cpu variables containing stack boundaries). To prevent unwind loops which go through a stack multiple times, we have to track the set of unwound stacks, requiring a stack_type enum which needs to cater for all the stacks of all possible callees. To prevent loops within a stack, we must track the prior FP values. This patch reworks the unwinder to minimize the work in the core of the unwinder, and to remove the need for the stack_type enum. The set of accessible stacks (and their boundaries) are determined at the start of the unwind, and the current stack is tracked during the unwind, with completed stacks removed from the set of accessible stacks. This makes the boundary checks more accurate (e.g. detecting overlapped frame records), and removes the need for separate tracking of the prior FP and visited stacks. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Kalesh Singh <kaleshsingh@google.com> Reviewed-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com> Reviewed-by: Mark Brown <broonie@kernel.org> Cc: Fuad Tabba <tabba@google.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220901130646.1316937-9-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-09 12:30:08 +01:00
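The scheme described in the commit above can be sketched in user-space C roughly as follows. This is a minimal illustration under stated assumptions, not the kernel's code: `struct stack_range`, `struct unwind_sketch`, `range_contains()` and `consume_frame()` are hypothetical names, and a consumed stack is represented here by zeroing its bounds. The set of accessible stacks is fixed before unwinding starts; each accepted frame record raises the lower bound of the current stack, and moving to another stack removes it from the set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types; low is inclusive, high is exclusive. */
struct stack_range {
	uintptr_t low;
	uintptr_t high;	/* low == high == 0 marks a consumed stack */
};

struct unwind_sketch {
	struct stack_range cur;		/* stack currently being unwound */
	struct stack_range *stacks;	/* accessible stacks, found up front */
	size_t nr_stacks;
};

/* True if [addr, addr + size) lies entirely within the range. */
static bool range_contains(const struct stack_range *r, uintptr_t addr,
			   size_t size)
{
	return r->low <= addr && addr <= r->high && size <= r->high - addr;
}

/* Accept or reject the next frame record at fp, updating the bounds. */
static bool consume_frame(struct unwind_sketch *u, uintptr_t fp, size_t size)
{
	/* Common case: the record is further up the current stack. */
	if (range_contains(&u->cur, fp, size)) {
		u->cur.low = fp + size;	/* no revisiting, no overlap */
		return true;
	}

	/* Otherwise, look for a transition to a stack not yet used. */
	for (size_t i = 0; i < u->nr_stacks; i++) {
		struct stack_range *s = &u->stacks[i];

		if (!range_contains(s, fp, size))
			continue;

		u->cur = *s;
		u->cur.low = fp + size;
		s->low = s->high = 0;	/* remove from the accessible set */
		return true;
	}

	return false;
}
```

Because the lower bound only ever moves upward and a stack can be entered at most once, a single bounds check rejects loops within a stack, loops across stacks, and overlapping frame records, without a separate record of the prior FP or of visited stack types.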
Mark Rutland	bd8abd6883	arm64: stacktrace: remove stack type from fp translator In subsequent patches we'll remove the stack_type enum, and move the FP translation logic out of the raw FP unwind code. In preparation for doing so, this patch removes the type parameter from the FP translation callback, and modifies kvm_nvhe_stack_kern_va() to determine the relevant stack directly. So that kvm_nvhe_stack_kern_va() can use the stackinfo_*() helpers, these are moved earlier in the file, but are not modified in any way. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Kalesh Singh <kaleshsingh@google.com> Reviewed-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com> Reviewed-by: Mark Brown <broonie@kernel.org> Cc: Fuad Tabba <tabba@google.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220901130646.1316937-8-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-09-09 12:30:08 +01:00


4506 Commits