Pull objtool updates from Ingo Molnar:
- Mark arch_cpu_idle_dead() __noreturn, make all architectures &
drivers that did this inconsistently follow this new, common
convention, and fix all the fallout that objtool can now detect
statically
- Fix/improve the ORC unwinder becoming unreliable due to
UNWIND_HINT_EMPTY ambiguity, split it into UNWIND_HINT_END_OF_STACK
and UNWIND_HINT_UNDEFINED to resolve it
- Fix noinstr violations in the KCSAN code and the lkdtm/stackleak code
- Generate ORC data for __pfx code
- Add more __noreturn annotations to various kernel startup/shutdown
and panic functions
- Misc improvements & fixes
* tag 'objtool-core-2023-04-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits)
x86/hyperv: Mark hv_ghcb_terminate() as noreturn
scsi: message: fusion: Mark mpt_halt_firmware() __noreturn
x86/cpu: Mark {hlt,resume}_play_dead() __noreturn
btrfs: Mark btrfs_assertfail() __noreturn
objtool: Include weak functions in global_noreturns check
cpu: Mark nmi_panic_self_stop() __noreturn
cpu: Mark panic_smp_self_stop() __noreturn
arm64/cpu: Mark cpu_park_loop() and friends __noreturn
x86/head: Mark *_start_kernel() __noreturn
init: Mark start_kernel() __noreturn
init: Mark [arch_call_]rest_init() __noreturn
objtool: Generate ORC data for __pfx code
x86/linkage: Fix padding for typed functions
objtool: Separate prefix code from stack validation code
objtool: Remove superfluous dead_end_function() check
objtool: Add symbol iteration helpers
objtool: Add WARN_INSN()
scripts/objdump-func: Support multiple functions
context_tracking: Fix KCSAN noinstr violation
objtool: Add stackleak instrumentation to uaccess safe list
...
Mark reported that the ORC unwinder incorrectly marks an unwind as
reliable when the unwind terminates prematurely in the dark corners of
return_to_handler() due to lack of information about the next frame.
The problem is UNWIND_HINT_EMPTY is used in two different situations:
1) The end of the kernel stack unwind before hitting user entry, boot
code, or fork entry
2) A blind spot in ORC coverage where the unwinder has to bail due to
lack of information about the next frame
The ORC unwinder has no way to tell the difference between the two.
When it encounters an undefined stack state with 'end=1', it blindly
marks the stack reliable, which can break the livepatch consistency
model.
Fix it by splitting UNWIND_HINT_EMPTY into UNWIND_HINT_UNDEFINED and
UNWIND_HINT_END_OF_STACK.
Reported-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/fd6212c8b450d3564b855e1cb48404d6277b4d9f.1677683419.git.jpoimboe@kernel.org
Refer to klp_modinfo declaration using kdoc.
Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Adjust documents to the file moves made by commit cfc1d27789 ("module:
Move all into module/").
Thanks to Yanteng Si for helping me to update
Documentation/translations/zh_CN/core-api/kernel-api.rst
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Yanteng Si <siyanteng@loongson.cn>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
The livepatch subsystem has several exported functions and objects with
kerneldoc comments. Though the livepatch documentation contains handwritten
descriptions of all of these exported functions, they are currently not
pulled into the docs build using the kernel-doc directive.
In order to allow readers of the documentation to see the full kerneldoc
comments in the generated documentation files, this change adds a new
Documentation/livepatch/api.rst page which contains kernel-doc directives
to link the kerneldoc comments directly in the documentation. With this,
all of the hand-written descriptions of the APIs now cross-reference the
kerneldoc comments on the new Livepatching APIs page, and running
./scripts/find-unused-docs.sh on kernel/livepatch no longer shows any files
as missing documentation.
Note that all of the handwritten API descriptions were left alone with the
exception of Documentation/livepatch/system-state.rst, which was updated to
allow the cross-referencing to work correctly. The file now follows the
cross-referencing formatting guidance specified in
Documentation/doc-guide/kernel-doc.rst. Furthermore, some comments around
klp_shadow_free_all() were updated to say <_, id> rather than <*, id> to
match the rest of the file, and to prevent the docs build from emitting an
"Inline emphasis start-string without end string" error.
Signed-off-by: David Vernet <void@manifault.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Acked-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20211221145743.4098360-1-void@manifault.com
Add documentation for reliable stacktrace. This is intended to describe
the semantics and to be an aid for implementing architecture support for
HAVE_RELIABLE_STACKTRACE.
Unwinding is a subtle area, and architectures vary greatly in both
implementation and the set of concerns that affect them, so I've tried
to avoid making this too specific to any given architecture. I've used
examples from both x86_64 and arm64 to explain corner cases in more
detail, but I've tried to keep the descriptions sufficient for those who
are unfamiliar with the particular architecture.
This document aims to give rationale for all the recommendations and
requirements, since that makes it easier to spot nearby issues, or when
a check happens to catch a few things at once.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
[Updates following review -- broonie]
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Automatically generate the tables of contents for livepatch documentation
files that have tables of contents rather than open coding them so things
are a little easier to maintain.
Signed-off-by: Mark Brown <broonie@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
After the previous patch, vmlinux-specific KLP relocations are now
applied early during KLP module load. This means that .klp.arch
sections are no longer needed for *vmlinux-specific* KLP relocations.
One might think they're still needed for *module-specific* KLP
relocations. If a to-be-patched module is loaded *after* its
corresponding KLP module is loaded, any corresponding KLP relocations
will be delayed until the to-be-patched module is loaded. If any
special sections (.parainstructions, for example) rely on those
relocations, their initializations (apply_paravirt) need to be done
afterwards. Thus the apparent need for arch_klp_init_object_loaded()
and its corresponding .klp.arch sections -- it allows some of the
special section initializations to be done at a later time.
But... if you look closer, that dependency between the special sections
and the module-specific KLP relocations doesn't actually exist in
reality. Looking at the contents of the .altinstructions and
.parainstructions sections, there's not a realistic scenario in which a
KLP module's .altinstructions or .parainstructions section needs to
access a symbol in a to-be-patched module. It might need to access a
local symbol or even a vmlinux symbol; but not another module's symbol.
When a special section needs to reference a local or vmlinux symbol, a
normal rela can be used instead of a KLP rela.
Since the special section initializations don't actually have any real
dependency on module-specific KLP relocations, .klp.arch and
arch_klp_init_object_loaded() no longer have a reason to exist. So
remove them.
As Peter said much more succinctly:
So the reason for .klp.arch was that .klp.rela.* stuff would overwrite
paravirt instructions. If that happens you're doing it wrong. Those
RELAs are core kernel, not module, and thus should've happened in
.rela.* sections at patch-module loading time.
Reverting this removes the two apply_{paravirt,alternatives}() calls
from the late patching path, and means we don't have to worry about
them when removing module_disable_ro().
[ jpoimboe: Rewrote patch description. Tweaked klp_init_object_loaded()
error path. ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Joe Lawrence <joe.lawrence@redhat.com>
Acked-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
The contents of those directories were orphaned at the documentation
body.
While those directories could likely be moved to be inside some guide,
I'm opting to just adding their indexes to the main one, removing the
:orphan: and adding the SPDX header.
For the drivers, the rationale is that the documentation contains
a mix of Kernelspace, uAPI and admin-guide. So, better to keep them on
separate directories, as we've be doing with similar subsystem-specific
docs that were not split yet.
For the others, well... I'm too lazy to do the move. Also, it
seems to make sense to keep at least some of those at the main
dir (like kbuild, for example). In any case, a latter patch
could do the move.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Make the structure of "Livepatch module Elf format" document similar
to the main "Livepatch" document.
Also make the structure of "(Un)patching Callbacks" document similar
to the "Shadow Variables" document.
It fixes the most visible inconsistencies of the documentation
generated from the ReST format.
Signed-off-by: Petr Mladek <pmladek@suse.com>
Acked-by: Joe Lawrence <joe.lawrence@redhat.com>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Miroslav Benes <mbenes@suse.cz>
Reviewed-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Convert livepatch documentation to ReST format. The changes
are mostly trivial, as the documents are already on a good
shape. Just a few markup changes are needed for Sphinx to
properly parse the docs.
The conversion is actually:
- add blank lines and identation in order to identify paragraphs;
- fix tables markups;
- add some lists markups;
- mark literal blocks;
- The in-file TOC becomes a comment, in order to skip it from the
output, as Sphinx already generates an index there.
- adjust title markups.
At its new index.rst, let's add a :orphan: while this is not linked to
the main index.rst file, in order to avoid build warnings.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Acked-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Joe Lawrence <joe.lawrence@redhat.com>
Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
The fake signal is send automatically now. We can rely on it completely
and remove the sysfs attribute.
Signed-off-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
An administrator may send a fake signal to all remaining blocking tasks
of a running transition by writing to
/sys/kernel/livepatch/<patch>/signal attribute. Let's do it
automatically after 15 seconds. The timeout is chosen deliberately. It
gives the tasks enough time to transition themselves.
Theoretically, sending it once should be more than enough. However,
every task must get outside of a patched function to be successfully
transitioned. It could prove not to be simple and resending could be
helpful in that case.
A new workqueue job could be a cleaner solution to achieve it, but it
could also introduce deadlocks and cause more headaches with
synchronization and cancelling.
[jkosina@suse.cz: removed added newline]
Signed-off-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Add a few livepatch modules and simple target modules that the included
regression suite can run tests against:
- basic livepatching (multiple patches, atomic replace)
- pre/post (un)patch callbacks
- shadow variable API
Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Tested-by: Miroslav Benes <mbenes@suse.cz>
Tested-by: Alice Ferrazzi <alice.ferrazzi@gmail.com>
Acked-by: Joe Lawrence <joe.lawrence@redhat.com>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
The atomic replace and cumulative patches were introduced as a more secure
way to handle dependent patches. They simplify the logic:
+ Any new cumulative patch is supposed to take over shadow variables
and changes made by callbacks from previous livepatches.
+ All replaced patches are discarded and the modules can be unloaded.
As a result, there is only one scenario when a cumulative livepatch
gets disabled.
The different handling of "normal" and cumulative patches might cause
confusion. It would make sense to keep only one mode. On the other hand,
it would be rude to enforce using the cumulative livepatches even for
trivial and independent (hot) fixes.
However, the stack of patches is not really necessary any longer.
The patch ordering was never clearly visible via the sysfs interface.
Also the "normal" patches need a lot of caution anyway.
Note that the list of enabled patches is still necessary but the ordering
is not longer enforced.
Otherwise, the code is ready to disable livepatches in an random order.
Namely, klp_check_stack_func() always looks for the function from
the livepatch that is being disabled. klp_func structures are just
removed from the related func_stack. Finally, the ftrace handlers
is removed only when the func_stack becomes empty.
Signed-off-by: Petr Mladek <pmladek@suse.com>
Acked-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
User documentation for the atomic replace feature. It makes it easier
to maintain livepatches using so-called cumulative patches.
Signed-off-by: Petr Mladek <pmladek@suse.com>
Acked-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Joe Lawrence <joe.lawrence@redhat.com>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Sometimes we would like to revert a particular fix. Currently, this
is not easy because we want to keep all other fixes active and we
could revert only the last applied patch.
One solution would be to apply new patch that implemented all
the reverted functions like in the original code. It would work
as expected but there will be unnecessary redirections. In addition,
it would also require knowing which functions need to be reverted at
build time.
Another problem is when there are many patches that touch the same
functions. There might be dependencies between patches that are
not enforced on the kernel side. Also it might be pretty hard to
actually prepare the patch and ensure compatibility with the other
patches.
Atomic replace && cumulative patches:
A better solution would be to create cumulative patch and say that
it replaces all older ones.
This patch adds a new "replace" flag to struct klp_patch. When it is
enabled, a set of 'nop' klp_func will be dynamically created for all
functions that are already being patched but that will no longer be
modified by the new patch. They are used as a new target during
the patch transition.
The idea is to handle Nops' structures like the static ones. When
the dynamic structures are allocated, we initialize all values that
are normally statically defined.
The only exception is "new_func" in struct klp_func. It has to point
to the original function and the address is known only when the object
(module) is loaded. Note that we really need to set it. The address is
used, for example, in klp_check_stack_func().
Nevertheless we still need to distinguish the dynamically allocated
structures in some operations. For this, we add "nop" flag into
struct klp_func and "dynamic" flag into struct klp_object. They
need special handling in the following situations:
+ The structures are added into the lists of objects and functions
immediately. In fact, the lists were created for this purpose.
+ The address of the original function is known only when the patched
object (module) is loaded. Therefore it is copied later in
klp_init_object_loaded().
+ The ftrace handler must not set PC to func->new_func. It would cause
infinite loop because the address points back to the beginning of
the original function.
+ The various free() functions must free the structure itself.
Note that other ways to detect the dynamic structures are not considered
safe. For example, even the statically defined struct klp_object might
include empty funcs array. It might be there just to run some callbacks.
Also note that the safe iterator must be used in the free() functions.
Otherwise already freed structures might get accessed.
Special callbacks handling:
The callbacks from the replaced patches are _not_ called by intention.
It would be pretty hard to define a reasonable semantic and implement it.
It might even be counter-productive. The new patch is cumulative. It is
supposed to include most of the changes from older patches. In most cases,
it will not want to call pre_unpatch() post_unpatch() callbacks from
the replaced patches. It would disable/break things for no good reasons.
Also it should be easier to handle various scenarios in a single script
in the new patch than think about interactions caused by running many
scripts from older patches. Not to say that the old scripts even would
not expect to be called in this situation.
Removing replaced patches:
One nice effect of the cumulative patches is that the code from the
older patches is no longer used. Therefore the replaced patches can
be removed. It has several advantages:
+ Nops' structs will no longer be necessary and might be removed.
This would save memory, restore performance (no ftrace handler),
allow clear view on what is really patched.
+ Disabling the patch will cause using the original code everywhere.
Therefore the livepatch callbacks could handle only one scenario.
Note that the complication is already complex enough when the patch
gets enabled. It is currently solved by calling callbacks only from
the new cumulative patch.
+ The state is clean in both the sysfs interface and lsmod. The modules
with the replaced livepatches might even get removed from the system.
Some people actually expected this behavior from the beginning. After all
a cumulative patch is supposed to "completely" replace an existing one.
It is like when a new version of an application replaces an older one.
This patch does the first step. It removes the replaced patches from
the list of patches. It is safe. The consistency model ensures that
they are no longer used. By other words, each process works only with
the structures from klp_transition_patch.
The removal is done by a special function. It combines actions done by
__disable_patch() and klp_complete_transition(). But it is a fast
track without all the transaction-related stuff.
Signed-off-by: Jason Baron <jbaron@akamai.com>
[pmladek@suse.com: Split, reuse existing code, simplified]
Signed-off-by: Petr Mladek <pmladek@suse.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Acked-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
The possibility to re-enable a registered patch was useful for immediate
patches where the livepatch module had to stay until the system reboot.
The improved consistency model allows to achieve the same result by
unloading and loading the livepatch module again.
Also we are going to add a feature called atomic replace. It will allow
to create a patch that would replace all already registered patches.
The aim is to handle dependent patches more securely. It will obsolete
the stack of patches that helped to handle the dependencies so far.
Then it might be unclear when a cumulative patch re-enabling is safe.
It would be complicated to support the many modes. Instead we could
actually make the API and code easier to understand.
Therefore, remove the two step public API. All the checks and init calls
are moved from klp_register_patch() to klp_enabled_patch(). Also the patch
is automatically freed, including the sysfs interface when the transition
to the disabled state is completed.
As a result, there is never a disabled patch on the top of the stack.
Therefore we do not need to check the stack in __klp_enable_patch().
And we could simplify the check in __klp_disable_patch().
Also the API and logic is much easier. It is enough to call
klp_enable_patch() in module_init() call. The patch can be disabled
by writing '0' into /sys/kernel/livepatch/<patch>/enabled. Then the module
can be removed once the transition finishes and sysfs interface is freed.
The only problem is how to free the structures and kobjects safely.
The operation is triggered from the sysfs interface. We could not put
the related kobject from there because it would cause lock inversion
between klp_mutex and kernfs locks, see kn->count lockdep map.
Therefore, offload the free task to a workqueue. It is perfectly fine:
+ The patch can no longer be used in the livepatch operations.
+ The module could not be removed until the free operation finishes
and module_put() is called.
+ The operation is asynchronous already when the first
klp_try_complete_transition() fails and another call
is queued with a delay.
Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Acked-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Semantic changes are possible since the commit d83a7cb375
("livepatch: change to a per-task consistency model").
Also data structures can be patched since the commit 439e7271dc
("livepatch: introduce shadow variable API").
It is a high time we removed these limitations from the documentation.
Signed-off-by: Petr Mladek <pmladek@suse.com>
Acked-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
We might need to do some actions before the shadow variable is freed.
For example, we might need to remove it from a list or free some data
that it points to.
This is already possible now. The user can get the shadow variable
by klp_shadow_get(), do the necessary actions, and then call
klp_shadow_free().
This patch allows to do it a more elegant way. The user could implement
the needed actions in a callback that is passed to klp_shadow_free()
as a parameter. The callback usually does reverse operations to
the constructor callback that can be called by klp_shadow_*alloc().
It is especially useful for klp_shadow_free_all(). There we need to do
these extra actions for each found shadow variable with the given ID.
Note that the memory used by the shadow variable itself is still released
later by rcu callback. It is needed to protect internal structures that
keep all shadow variables. But the destructor is called immediately.
The shadow variable must not be access anyway after klp_shadow_free()
is called. The user is responsible to protect this any suitable way.
Be aware that the destructor is called under klp_shadow_lock. It is
the same as for the contructor in klp_shadow_alloc().
Signed-off-by: Petr Mladek <pmladek@suse.com>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
The existing API allows to pass a sample data to initialize the shadow
data. It works well when the data are position independent. But it fails
miserably when we need to set a pointer to the shadow structure itself.
Unfortunately, we might need to initialize the pointer surprisingly
often because of struct list_head. It is even worse because the list
might be hidden in other common structures, for example, struct mutex,
struct wait_queue_head.
For example, this was needed to fix races in ALSA sequencer. It required
to add mutex into struct snd_seq_client. See commit b3defb791b
("ALSA: seq: Make ioctls race-free") and commit d15d662e89
("ALSA: seq: Fix racy pool initializations")
This patch makes the API more safe. A custom constructor function and data
are passed to klp_shadow_*alloc() functions instead of the sample data.
Note that ctor_data are no longer a template for shadow->data. It might
point to any data that might be necessary when the constructor is called.
Also note that the constructor is called under klp_shadow_lock. It is
an internal spin_lock that synchronizes alloc() vs. get() operations,
see klp_shadow_get_or_alloc(). On one hand, this adds a risk of ABBA
deadlocks. On the other hand, it allows to do some operations safely.
For example, we could add the new structure into an existing list.
This must be done only once when the structure is allocated.
Reported-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>