linux

mirror of https://github.com/Dasharo/linux.git synced 2026-03-06 15:25:10 -08:00

Author	SHA1	Message	Date
Ard Biesheuvel	c3cb6c158c	objtool: Allow arch code to discover jump table size In preparation for adding support for annotated jump tables, where ELF relocations and symbols are used to describe the locations of jump tables in the executable, refactor the jump table discovery logic so the table size can be returned from arch_find_switch_table(). Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20241011170847.334429-12-ardb+git@google.com	2024-12-02 12:24:29 +01:00
Josh Poimboeuf	ed1cb76ebd	objtool: Detect non-relocated text references When kernel IBT is enabled, objtool detects all text references in order to determine which functions can be indirectly branched to. In text, such references look like one of the following: mov $0x0,%rax R_X86_64_32S .init.text+0x7e0a0 lea 0x0(%rip),%rax R_X86_64_PC32 autoremove_wake_function-0x4 Either way the function pointer is denoted by a relocation, so objtool just reads that. However there are some "lea xxx(%rip)" cases which don't use relocations because they're referencing code in the same translation unit. Objtool doesn't have visibility to those. The only currently known instances of that are a few hand-coded asm text references which don't actually need ENDBR. So it's not actually a problem at the moment. However if we enable -fpie, the compiler would start generating them and there would definitely be bugs in the IBT sealing. Detect non-relocated text references and handle them appropriately. [ Note: I removed the manual static_call_tramp check -- that should already be handled by the noendbr check. ] Reported-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>	2024-10-17 15:13:06 -07:00
Tiezhu Yang	da5b2ad1c2	objtool: Handle frame pointer related instructions After commit `a0f7085f6a` ("LoongArch: Add RANDOMIZE_KSTACK_OFFSET support"), there are three new instructions "addi.d $fp, $sp, 32", "sub.d $sp, $sp, $t0" and "addi.d $sp, $fp, -32" for the secondary stack in do_syscall(), then there is a objtool warning "return with modified stack frame" and no handle_syscall() which is the previous frame of do_syscall() in the call trace when executing the command "echo l > /proc/sysrq-trigger". objdump shows something like this: 0000000000000000 <do_syscall>: 0: 02ff8063 addi.d $sp, $sp, -32 4: 29c04076 st.d $fp, $sp, 16 8: 29c02077 st.d $s0, $sp, 8 c: 29c06061 st.d $ra, $sp, 24 10: 02c08076 addi.d $fp, $sp, 32 ... 74: 0011b063 sub.d $sp, $sp, $t0 ... a8: 4c000181 jirl $ra, $t0, 0 ... dc: 02ff82c3 addi.d $sp, $fp, -32 e0: 28c06061 ld.d $ra, $sp, 24 e4: 28c04076 ld.d $fp, $sp, 16 e8: 28c02077 ld.d $s0, $sp, 8 ec: 02c08063 addi.d $sp, $sp, 32 f0: 4c000020 jirl $zero, $ra, 0 The instruction "sub.d $sp, $sp, $t0" changes the stack bottom and the new stack size is a random value, in order to find the return address of do_syscall() which is stored in the original stack frame after executing "jirl $ra, $t0, 0", it should use fp which points to the original stack top. At the beginning, the thought is tended to decode the secondary stack instruction "sub.d $sp, $sp, $t0" and set it as a label, then check this label for the two frame pointer instructions to change the cfa base and cfa offset during the period of secondary stack in update_cfi_state(). This is valid for GCC but invalid for Clang due to there are different secondary stack instructions for ClangBuiltLinux on LoongArch, something like this: 0000000000000000 <do_syscall>: ... 88: 00119064 sub.d $a0, $sp, $a0 8c: 00150083 or $sp, $a0, $zero ... Actually, it equals to a single instruction "sub.d $sp, $sp, $a0", but there is no proper condition to check it as a label like GCC, and so the beginning thought is not a good way. Essentially, there are two special frame pointer instructions which are "addi.d $fp, $sp, imm" and "addi.d $sp, $fp, imm", the first one points fp to the original stack top and the second one restores the original stack bottom from fp. Based on the above analysis, in order to avoid adding an arch-specific update_cfi_state(), we just add a member "frame_pointer" in the "struct symbol" as a label to avoid affecting the current normal case, then set it as true only if there is "addi.d $sp, $fp, imm". The last is to check this label for the two frame pointer instructions to change the cfa base and cfa offset in update_cfi_state(). Tested with the following two configs: (1) CONFIG_RANDOMIZE_KSTACK_OFFSET=y && CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT=n (2) CONFIG_RANDOMIZE_KSTACK_OFFSET=y && CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT=y By the way, there is no effect for x86 with this patch, tested on the x86 machine with Fedora 40 system. Cc: stable@vger.kernel.org # 6.9+ Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-09-17 22:23:09 +08:00
Linus Torvalds	0c182ac2eb	Merge tag 'objtool-core-2024-07-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool updates from Ingo Molnar: - Fix bug that caused objtool to confuse certain memory ops added by KASAN instrumentation as stack accesses - Various faddr2line optimizations - Improve error messages * tag 'objtool-core-2024-07-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool/x86: objtool can confuse memory and stack access objtool: Use "action" in error message to be consistent with help scripts/faddr2line: Check only two symbols when calculating symbol size scripts/faddr2line: Remove call to addr2line from find_dir_prefix() scripts/faddr2line: Invoke addr2line as a single long-running process scripts/faddr2line: Pass --addresses argument to addr2line scripts/faddr2line: Check vmlinux only once scripts/faddr2line: Combine three readelf calls into one scripts/faddr2line: Reduce number of readelf calls to three	2024-07-16 16:55:33 -07:00
Alexandre Chartre	8e366d83ed	objtool/x86: objtool can confuse memory and stack access The encoding of an x86 instruction can include a ModR/M and a SIB (Scale-Index-Base) byte to describe the addressing mode of the instruction. objtool processes all addressing mode with a SIB base of 5 as having %rbp as the base register. However, a SIB base of 5 means that the effective address has either no base (if ModR/M mod is zero) or %rbp as the base (if ModR/M mod is 1 or 2). This can cause objtool to confuse an absolute address access with a stack operation. For example, objtool will see the following instruction: 4c 8b 24 25 e0 ff ff mov 0xffffffffffffffe0,%r12 as a stack operation (i.e. similar to: mov -0x20(%rbp), %r12). [Note that this kind of weird absolute address access is added by the compiler when using KASAN.] If this perceived stack operation happens to reference the location where %r12 was pushed on the stack then the objtool validation will think that %r12 is being restored and this can cause a stack state mismatch. This kind behavior was seen on xfs code, after a minor change (convert kmem_alloc() to kmalloc()): >> fs/xfs/xfs.o: warning: objtool: xfs_da_grow_inode_int+0x6c1: stack state mismatch: reg1[12]=-2-48 reg2[12]=-1+0 Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202402220435.MGN0EV6l-lkp@intel.com/ Signed-off-by: Alexandre Chartre <alexandre.chartre@oracle.com> Link: https://lore.kernel.org/r/20240620144747.2524805-1-alexandre.chartre@oracle.com Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>	2024-07-02 23:40:54 -07:00
Peter Zijlstra	d2a793dae2	x86/alternatives: Add nested alternatives macros Instead of making increasingly complicated ALTERNATIVE_n() implementations, use a nested alternative expression. The only difference between: ALTERNATIVE_2(oldinst, newinst1, flag1, newinst2, flag2) and ALTERNATIVE(ALTERNATIVE(oldinst, newinst1, flag1), newinst2, flag2) is that the outer alternative can add additional padding when the inner alternative is the shorter one, which then results in alt_instr::instrlen being inconsistent. However, this is easily remedied since the alt_instr entries will be consecutive and it is trivial to compute the max(alt_instr::instrlen) at runtime while patching. Specifically, after this the ALTERNATIVE_2 macro, after CPP expansion (and manual layout), looks like this: .macro ALTERNATIVE_2 oldinstr, newinstr1, ft_flags1, newinstr2, ft_flags2 740: 740: \oldinstr ; 741: .skip -(((744f-743f)-(741b-740b)) > 0) * ((744f-743f)-(741b-740b)),0x90 ; 742: .pushsection .altinstructions,"a" ; altinstr_entry 740b,743f,\ft_flags1,742b-740b,744f-743f ; .popsection ; .pushsection .altinstr_replacement,"ax" ; 743: \newinstr1 ; 744: .popsection ; ; 741: .skip -(((744f-743f)-(741b-740b)) > 0) * ((744f-743f)-(741b-740b)),0x90 ; 742: .pushsection .altinstructions,"a" ; altinstr_entry 740b,743f,\ft_flags2,742b-740b,744f-743f ; .popsection ; .pushsection .altinstr_replacement,"ax" ; 743: \newinstr2 ; 744: .popsection ; .endm The only label that is ambiguous is 740, however they all reference the same spot, so that doesn't matter. NOTE: obviously only @oldinstr may be an alternative; making @newinstr an alternative would mean patching .altinstr_replacement which very likely isn't what is intended, also the labels will be confused in that case. [ bp: Debug an issue where it would match the wrong two insns and and consider them nested due to the same signed offsets in the .alternative section and use instr_va() to compare the full virtual addresses instead. - Use new labels to denote that the new, nested alternatives are being used when staring at preprocessed output. - Use the %c constraint everywhere instead of %P and document the difference for future reference. ] Signed-off-by: Peter Zijlstra <peterz@infradead.org> Co-developed-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20230628104952.GA2439977@hirez.programming.kicks-ass.net	2024-06-11 17:13:08 +02:00
Linus Torvalds	1e3cd03c54	Merge tag 'loongarch-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch updates from Huacai Chen: - Add objtool support for LoongArch - Add ORC stack unwinder support for LoongArch - Add kernel livepatching support for LoongArch - Select ARCH_HAS_CURRENT_STACK_POINTER in Kconfig - Select HAVE_ARCH_USERFAULTFD_MINOR in Kconfig - Some bug fixes and other small changes * tag 'loongarch-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: LoongArch/crypto: Clean up useless assignment operations LoongArch: Define the __io_aw() hook as mmiowb() LoongArch: Remove superfluous flush_dcache_page() definition LoongArch: Move {dmw,tlb}_virt_to_page() definition to page.h LoongArch: Change __my_cpu_offset definition to avoid mis-optimization LoongArch: Select HAVE_ARCH_USERFAULTFD_MINOR in Kconfig LoongArch: Select ARCH_HAS_CURRENT_STACK_POINTER in Kconfig LoongArch: Add kernel livepatching support LoongArch: Add ORC stack unwinder support objtool: Check local label in read_unwind_hints() objtool: Check local label in add_dead_ends() objtool/LoongArch: Enable orc to be built objtool/x86: Separate arch-specific and generic parts objtool/LoongArch: Implement instruction decoder objtool/LoongArch: Enable objtool to be built	2024-03-22 10:22:45 -07:00
Linus Torvalds	685d982112	Merge tag 'x86-core-2024-03-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core x86 updates from Ingo Molnar: - The biggest change is the rework of the percpu code, to support the 'Named Address Spaces' GCC feature, by Uros Bizjak: - This allows C code to access GS and FS segment relative memory via variables declared with such attributes, which allows the compiler to better optimize those accesses than the previous inline assembly code. - The series also includes a number of micro-optimizations for various percpu access methods, plus a number of cleanups of %gs accesses in assembly code. - These changes have been exposed to linux-next testing for the last ~5 months, with no known regressions in this area. - Fix/clean up __switch_to()'s broken but accidentally working handling of FPU switching - which also generates better code - Propagate more RIP-relative addressing in assembly code, to generate slightly better code - Rework the CPU mitigations Kconfig space to be less idiosyncratic, to make it easier for distros to follow & maintain these options - Rework the x86 idle code to cure RCU violations and to clean up the logic - Clean up the vDSO Makefile logic - Misc cleanups and fixes * tag 'x86-core-2024-03-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits) x86/idle: Select idle routine only once x86/idle: Let prefer_mwait_c1_over_halt() return bool x86/idle: Cleanup idle_setup() x86/idle: Clean up idle selection x86/idle: Sanitize X86_BUG_AMD_E400 handling sched/idle: Conditionally handle tick broadcast in default_idle_call() x86: Increase brk randomness entropy for 64-bit systems x86/vdso: Move vDSO to mmap region x86/vdso/kbuild: Group non-standard build attributes and primary object file rules together x86/vdso: Fix rethunk patching for vdso-image-{32,64}.o x86/retpoline: Ensure default return thunk isn't used at runtime x86/vdso: Use CONFIG_COMPAT_32 to specify vdso32 x86/vdso: Use $(addprefix ) instead of $(foreach ) x86/vdso: Simplify obj-y addition x86/vdso: Consolidate targets and clean-files x86/bugs: Rename CONFIG_RETHUNK => CONFIG_MITIGATION_RETHUNK x86/bugs: Rename CONFIG_CPU_SRSO => CONFIG_MITIGATION_SRSO x86/bugs: Rename CONFIG_CPU_IBRS_ENTRY => CONFIG_MITIGATION_IBRS_ENTRY x86/bugs: Rename CONFIG_CPU_UNRET_ENTRY => CONFIG_MITIGATION_UNRET_ENTRY x86/bugs: Rename CONFIG_SLS => CONFIG_MITIGATION_SLS ...	2024-03-11 19:53:15 -07:00
Tiezhu Yang	3c7266cd7b	objtool/LoongArch: Enable orc to be built Implement arch-specific init_orc_entry(), write_orc_entry(), reg_name(), orc_type_name(), print_reg() and orc_print_dump(), then set BUILD_ORC as y to build the orc related files. Co-developed-by: Jinyang He <hejinyang@loongson.cn> Signed-off-by: Jinyang He <hejinyang@loongson.cn> Co-developed-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-03-11 22:23:47 +08:00
Tiezhu Yang	b8e85e6f3a	objtool/x86: Separate arch-specific and generic parts Move init_orc_entry(), write_orc_entry(), reg_name(), orc_type_name() and print_reg() from generic orc_gen.c and orc_dump.c to arch-specific orc.c, then introduce a new function orc_print_dump() to print info. This is preparation for later patch, no functionality change. Co-developed-by: Jinyang He <hejinyang@loongson.cn> Signed-off-by: Jinyang He <hejinyang@loongson.cn> Co-developed-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-03-11 22:23:47 +08:00
Tiezhu Yang	b2d23158e6	objtool/LoongArch: Implement instruction decoder Only copy the minimal definitions of instruction opcodes and formats in inst.h from arch/loongarch to tools/arch/loongarch, and also copy the definition of sign_extend64() to tools/include/linux/bitops.h to decode the following kinds of instructions: (1) stack pointer related instructions addi.d, ld.d, st.d, ldptr.d and stptr.d (2) branch and jump related instructions beq, bne, blt, bge, bltu, bgeu, beqz, bnez, bceqz, bcnez, b, bl and jirl (3) other instructions break, nop and ertn See more info about instructions in LoongArch Reference Manual: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html Co-developed-by: Jinyang He <hejinyang@loongson.cn> Signed-off-by: Jinyang He <hejinyang@loongson.cn> Co-developed-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-03-11 22:23:47 +08:00
Tiezhu Yang	e8aff71ca9	objtool/LoongArch: Enable objtool to be built Add the minimal changes to enable objtool build on LoongArch, most of the functions are stubs to only fix the build errors when make -C tools/objtool. This is similar with commit `e52ec98c5a` ("objtool/powerpc: Enable objtool to be built on ppc"). Co-developed-by: Jinyang He <hejinyang@loongson.cn> Signed-off-by: Jinyang He <hejinyang@loongson.cn> Co-developed-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>	2024-03-11 22:23:46 +08:00
H. Peter Anvin (Intel)	cd19bab825	x86/objtool: Teach objtool about ERET[US] Update the objtool decoder to know about the ERET[US] instructions (type INSN_CONTEXT_SWITCH). Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com> Signed-off-by: Xin Li <xin3.li@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Shan Kang <shan.kang@intel.com> Link: https://lore.kernel.org/r/20231205105030.8698-11-xin3.li@intel.com	2024-01-31 22:00:30 +01:00
Breno Leitao	aefb2f2e61	x86/bugs: Rename CONFIG_RETPOLINE => CONFIG_MITIGATION_RETPOLINE Step 5/10 of the namespace unification of CPU mitigations related Kconfig options. [ mingo: Converted a few more uses in comments/messages as well. ] Suggested-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Ariel Miculas <amiculas@cisco.com> Acked-by: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20231121160740.1249350-6-leitao@debian.org	2024-01-10 10:52:28 +01:00
Ruan Jinjie	758a74306f	objtool: Use 'the fallthrough' pseudo-keyword Replace the existing /* fallthrough */ comments with the new 'fallthrough' pseudo-keyword macro: https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-kernel@vger.kernel.org	2023-10-03 21:37:35 +02:00
Peter Zijlstra	d025b7bac0	x86/cpu: Rename original retbleed methods Rename the original retbleed return thunk and untrain_ret to retbleed_return_thunk() and retbleed_untrain_ret(). No functional changes. Suggested-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20230814121148.909378169@infradead.org	2023-08-16 21:47:53 +02:00
Peter Zijlstra	d43490d0ab	x86/cpu: Clean up SRSO return thunk mess Use the existing configurable return thunk. There is absolute no justification for having created this __x86_return_thunk alternative. To clarify, the whole thing looks like: Zen3/4 does: srso_alias_untrain_ret: nop2 lfence jmp srso_alias_return_thunk int3 srso_alias_safe_ret: // aliasses srso_alias_untrain_ret just so add $8, %rsp ret int3 srso_alias_return_thunk: call srso_alias_safe_ret ud2 While Zen1/2 does: srso_untrain_ret: movabs $foo, %rax lfence call srso_safe_ret (jmp srso_return_thunk ?) int3 srso_safe_ret: // embedded in movabs instruction add $8,%rsp ret int3 srso_return_thunk: call srso_safe_ret ud2 While retbleed does: zen_untrain_ret: test $0xcc, %bl lfence jmp zen_return_thunk int3 zen_return_thunk: // embedded in the test instruction ret int3 Where Zen1/2 flush the BTB entry using the instruction decoder trick (test,movabs) Zen3/4 use BTB aliasing. SRSO adds a return sequence (srso_safe_ret()) which forces the function return instruction to speculate into a trap (UD2). This RET will then mispredict and execution will continue at the return site read from the top of the stack. Pick one of three options at boot (evey function can only ever return once). [ bp: Fixup commit message uarch details and add them in a comment in the code too. Add a comment about the srso_select_mitigation() dependency on retbleed_select_mitigation(). Add moar ifdeffery for 32-bit builds. Add a dummy srso_untrain_ret_alias() definition for 32-bit alternatives needing the symbol. ] Fixes: `fb3bd914b3` ("x86/srso: Add a Speculative RAS Overflow mitigation") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20230814121148.842775684@infradead.org	2023-08-16 21:47:24 +02:00
Peter Zijlstra	4ae68b26c3	objtool/x86: Fix SRSO mess Objtool --rethunk does two things: - it collects all (tail) call's of __x86_return_thunk and places them into .return_sites. These are typically compiler generated, but RET also emits this same. - it fudges the validation of the __x86_return_thunk symbol; because this symbol is inside another instruction, it can't actually find the instruction pointed to by the symbol offset and gets upset. Because these two things pertained to the same symbol, there was no pressing need to separate these two separate things. However, alas, along comes SRSO and more crazy things to deal with appeared. The SRSO patch itself added the following symbol names to identify as rethunk: 'srso_untrain_ret', 'srso_safe_ret' and '__ret' Where '__ret' is the old retbleed return thunk, 'srso_safe_ret' is a new similarly embedded return thunk, and 'srso_untrain_ret' is completely unrelated to anything the above does (and was only included because of that INT3 vs UD2 issue fixed previous). Clear things up by adding a second category for the embedded instruction thing. Fixes: `fb3bd914b3` ("x86/srso: Add a Speculative RAS Overflow mitigation") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20230814121148.704502245@infradead.org	2023-08-16 09:39:16 +02:00
Borislav Petkov (AMD)	fb3bd914b3	x86/srso: Add a Speculative RAS Overflow mitigation Add a mitigation for the speculative return address stack overflow vulnerability found on AMD processors. The mitigation works by ensuring all RET instructions speculate to a controlled location, similar to how speculation is controlled in the retpoline sequence. To accomplish this, the __x86_return_thunk forces the CPU to mispredict every function return using a 'safe return' sequence. To ensure the safety of this mitigation, the kernel must ensure that the safe return sequence is itself free from attacker interference. In Zen3 and Zen4, this is accomplished by creating a BTB alias between the untraining function srso_untrain_ret_alias() and the safe return function srso_safe_ret_alias() which results in evicting a potentially poisoned BTB entry and using that safe one for all function returns. In older Zen1 and Zen2, this is accomplished using a reinterpretation technique similar to Retbleed one: srso_untrain_ret() and srso_safe_ret(). Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>	2023-07-27 11:07:14 +02:00
Linus Torvalds	6f612579be	Merge tag 'objtool-core-2023-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool updates from Ingo Molar: "Build footprint & performance improvements: - Reduce memory usage with CONFIG_DEBUG_INFO=y In the worst case of an allyesconfig+CONFIG_DEBUG_INFO=y kernel, DWARF creates almost 200 million relocations, ballooning objtool's peak heap usage to 53GB. These patches reduce that to 25GB. On a distro-type kernel with kernel IBT enabled, they reduce objtool's peak heap usage from 4.2GB to 2.8GB. These changes also improve the runtime significantly. Debuggability improvements: - Add the unwind_debug command-line option, for more extend unwinding debugging output - Limit unreachable warnings to once per function - Add verbose option for disassembling affected functions - Include backtrace in verbose mode - Detect missing __noreturn annotations - Ignore exc_double_fault() __noreturn warnings - Remove superfluous global_noreturns entries - Move noreturn function list to separate file - Add __kunit_abort() to noreturns Unwinder improvements: - Allow stack operations in UNWIND_HINT_UNDEFINED regions - drm/vmwgfx: Add unwind hints around RBP clobber Cleanups: - Move the x86 entry thunk restore code into thunk functions - x86/unwind/orc: Use swap() instead of open coding it - Remove unnecessary/unused variables Fixes for modern stack canary handling" * tag 'objtool-core-2023-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (42 commits) x86/orc: Make the is_callthunk() definition depend on CONFIG_BPF_JIT=y objtool: Skip reading DWARF section data objtool: Free insns when done objtool: Get rid of reloc->rel[a] objtool: Shrink elf hash nodes objtool: Shrink reloc->sym_reloc_entry objtool: Get rid of reloc->jump_table_start objtool: Get rid of reloc->addend objtool: Get rid of reloc->type objtool: Get rid of reloc->offset objtool: Get rid of reloc->idx objtool: Get rid of reloc->list objtool: Allocate relocs in advance for new rela sections objtool: Add for_each_reloc() objtool: Don't free memory in elf_close() objtool: Keep GElf_Rel[a] structs synced objtool: Add elf_create_section_pair() objtool: Add mark_sec_changed() objtool: Fix reloc_hash size objtool: Consolidate rel/rela handling ...	2023-06-27 15:05:41 -07:00
Josh Poimboeuf	0696b6e314	objtool: Get rid of reloc->addend Get the addend from the embedded GElf_Rel[a] struct. With allyesconfig + CONFIG_DEBUG_INFO: - Before: peak heap memory consumption: 42.10G - After: peak heap memory consumption: 40.37G Link: https://lore.kernel.org/r/ad2354f95d9ddd86094e3f7687acfa0750657784.1685464332.git.jpoimboe@kernel.org Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>	2023-06-07 10:03:23 -07:00
Josh Poimboeuf	fcee899d27	objtool: Get rid of reloc->type Get the type from the embedded GElf_Rel[a] struct. Link: https://lore.kernel.org/r/d1c1f8da31e4f052a2478aea585fcf355cacc53a.1685464332.git.jpoimboe@kernel.org Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>	2023-06-07 10:03:22 -07:00
Josh Poimboeuf	6342a20efb	objtool: Add elf_create_section_pair() When creating an annotation section, allocate the reloc section data at the beginning. This simplifies the data model a bit and also saves memory due to the removal of malloc() in elf_rebuild_reloc_section(). With allyesconfig + CONFIG_DEBUG_INFO: - Before: peak heap memory consumption: 53.49G - After: peak heap memory consumption: 49.02G Link: https://lore.kernel.org/r/048e908f3ede9b66c15e44672b6dda992b1dae3e.1685464332.git.jpoimboe@kernel.org Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>	2023-06-07 10:03:17 -07:00
Peter Zijlstra	270a69c448	x86/alternative: Support relocations in alternatives A little while ago someone (Kirill) ran into the whole 'alternatives don't do relocations nonsense' again and I got annoyed enough to actually look at the code. Since the whole alternative machinery already fully decodes the instructions it is simple enough to adjust immediates and displacement when needed. Specifically, the immediates for IP modifying instructions (JMP, CALL, Jcc) and the displacement for RIP-relative instructions. [ bp: Massage comment some more and get rid of third loop in apply_relocation(). ] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20230208171431.313857925@infradead.org	2023-05-10 14:47:08 +02:00
Peter Zijlstra	3ee88df1b0	objtool: Make instruction::stack_ops a single-linked list struct instruction { struct list_head list; /* 0 16 / struct hlist_node hash; / 16 16 / struct list_head call_node; / 32 16 / struct section sec; /* 48 8 / long unsigned int offset; / 56 8 / / --- cacheline 1 boundary (64 bytes) --- / unsigned int len; / 64 4 / enum insn_type type; / 68 4 / long unsigned int immediate; / 72 8 / u16 dead_end:1; / 80: 0 2 / u16 ignore:1; / 80: 1 2 / u16 ignore_alts:1; / 80: 2 2 / u16 hint:1; / 80: 3 2 / u16 save:1; / 80: 4 2 / u16 restore:1; / 80: 5 2 / u16 retpoline_safe:1; / 80: 6 2 / u16 noendbr:1; / 80: 7 2 / u16 entry:1; / 80: 8 2 / / XXX 7 bits hole, try to pack / s8 instr; / 82 1 / u8 visited; / 83 1 / / XXX 4 bytes hole, try to pack / struct alt_group alt_group; /* 88 8 / struct symbol call_dest; /* 96 8 / struct instruction jump_dest; /* 104 8 / struct instruction first_jump_src; /* 112 8 / struct reloc jump_table; /* 120 8 / / --- cacheline 2 boundary (128 bytes) --- / struct reloc reloc; /* 128 8 / struct list_head alts; / 136 16 / struct symbol sym; /* 152 8 / - struct list_head stack_ops; / 160 16 / - struct cfi_state cfi; /* 176 8 / + struct stack_op stack_ops; /* 160 8 / + struct cfi_state cfi; /* 168 8 / - / size: 184, cachelines: 3, members: 29 / - / sum members: 178, holes: 1, sum holes: 4 / + / size: 176, cachelines: 3, members: 29 / + / sum members: 170, holes: 1, sum holes: 4 / / sum bitfield members: 9 bits, bit holes: 1, sum bit holes: 7 bits / - / last cacheline: 56 bytes / + / last cacheline: 48 bytes */ }; pre: 5:58.22 real, 226.69 user, 131.22 sys, 26221520 mem post: 5:58.50 real, 229.64 user, 128.65 sys, 26221520 mem Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Josh Poimboeuf <jpoimboe@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> # build only Tested-by: Thomas Weißschuh <linux@weissschuh.net> # compile and run Link: https://lore.kernel.org/r/20230208172245.362196959@infradead.org	2023-02-23 09:20:59 +01:00

1 2 3 4 5

121 Commits