Commit Graph

394 Commits

Author SHA1 Message Date
Masahiro Yamada
28438794ab modpost: fix section mismatch check for exported init/exit sections
Since commit f02e8a6596 ("module: Sort exported symbols"),
EXPORT_SYMBOL* is placed in the individual section ___ksymtab(_gpl)+<sym>
(3 leading underscores instead of 2).

Since then, modpost cannot detect the bad combination of EXPORT_SYMBOL
and __init/__exit.

Fix the .fromsec field.

Fixes: f02e8a6596 ("module: Sort exported symbols")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2022-06-20 08:18:03 +09:00
Masahiro Yamada
a89227d769 modpost: use fnmatch() to simplify match()
Replace the own implementation for wildcard (glob) matching with
a function call to fnmatch().

Also, change the return type to 'bool'.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-06-05 06:20:57 +09:00
Masahiro Yamada
8c9ce89c5b modpost: simplify mod->name allocation
mod->name is set to the ELF filename with the suffix ".o" stripped.

The current code calls strdup() and free() to manipulate the string,
but a simpler approach is to pass new_module() with the name length
subtracted by 2.

Also, check if the passed filename ends with ".o" before stripping it.

The current code blindly chops the suffix:

    tmp[strlen(tmp) - 2] = '\0'

It will cause buffer under-run if strlen(tmp) < 2;

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2022-06-05 06:20:57 +09:00
Masahiro Yamada
31cb50b559 kbuild: check static EXPORT_SYMBOL* by script instead of modpost
The 'static' specifier and EXPORT_SYMBOL() are an odd combination.

Commit 15bfc2348d ("modpost: check for static EXPORT_SYMBOL*
functions") tried to detect it, but this check has false negatives.

Here is the sample code.

  Makefile:

    obj-y += foo1.o foo2.o

  foo1.c:

    #include <linux/export.h>
    static void foo(void) {}
    EXPORT_SYMBOL(foo);

  foo2.c:

    void foo(void) {}

foo1.c exports the static symbol 'foo', but modpost cannot catch it
because it is fooled by foo2.c, which has a global symbol with the
same name.

s->is_static is cleared if a global symbol with the same name is found
somewhere, but EXPORT_SYMBOL() and the global symbol do not necessarily
belong to the same compilation unit.

This check should be done per compilation unit, but I do not know how
to do it in modpost. modpost runs against vmlinux.o or modules, which
merges multiple objects, then forgets their origin.

modpost cannot parse individual objects because they may not be ELF but
LLVM IR when CONFIG_LTO_CLANG=y.

Add a simple bash script to parse the output from ${NM}. This works for
CONFIG_LTO_CLANG=y because llvm-nm can dump symbols of LLVM IR files.

Revert 15bfc2348d.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM-14 (x86-64)
2022-06-01 23:07:02 +09:00
Masahiro Yamada
c25e1c5582 kbuild: do not create *.prelink.o for Clang LTO or IBT
When CONFIG_LTO_CLANG=y, additional intermediate *.prelink.o is created
for each module. Also, objtool is postponed until LLVM IR is converted
to ELF.

CONFIG_X86_KERNEL_IBT works in a similar way to postpone objtool until
objects are merged together.

This commit stops generating *.prelink.o, so the build flow will look
similar with/without LTO.

The following figures show how the LTO build currently works, and
how this commit is changing it.

Current build flow
==================

 [1] single-object module

                                      $(LD)
           $(CC)                     +objtool              $(LD)
    foo.c --------------------> foo.o -----> foo.prelink.o -----> foo.ko
                              (LLVM IR)          (ELF)       |    (ELF)
                                                             |
                                                 foo.mod.o --/
                                                 (LLVM IR)

 [2] multi-object module
                                      $(LD)
           $(CC)         $(AR)       +objtool               $(LD)
    foo1.c -----> foo1.o -----> foo.o -----> foo.prelink.o -----> foo.ko
                           |  (archive)          (ELF)       |    (ELF)
    foo2.c -----> foo2.o --/                                 |
                 (LLVM IR)                       foo.mod.o --/
                                                 (LLVM IR)

  One confusion is that foo.o in multi-object module is an archive
  despite of its suffix.

New build flow
==============

 [1] single-object module

  Since there is only one object, there is no need to keep the LLVM IR.
  Use $(CC)+$(LD) to generate an ELF object in one build rule. When LTO
  is disabled, $(LD) is unneeded because $(CC) produces an ELF object.

               $(CC)+$(LD)+objtool              $(LD)
    foo.c ----------------------------> foo.o ---------> foo.ko
                                        (ELF)     |      (ELF)
                                                  |
                                      foo.mod.o --/
                                      (LLVM IR)

 [2] multi-object module

  Previously, $(AR) was used to combine LLVM IR files into an archive,
  but there was no technical reason to do so. Use $(LD) to merge them
  into a single ELF object.

                               $(LD)
             $(CC)            +objtool          $(LD)
    foo1.c ---------> foo1.o ---------> foo.o ---------> foo.ko
                                 |      (ELF)     |      (ELF)
    foo2.c ---------> foo2.o ----/                |
                     (LLVM IR)        foo.mod.o --/
                                      (LLVM IR)

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM-14 (x86-64)
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
2022-05-29 18:39:35 +09:00
Masahiro Yamada
68fef6704e modpost: squash if...else-if in find_elf_symbol2()
if ((addr - sym->st_value) < distance) {
            distance = addr - sym->st_value;
            near = sym;
    } else if ((addr - sym->st_value) == distance) {
            near = sym;
    }

is equivalent to:

    if (addr - sym->st_value <= distance) {
            distance = addr - sym->st_value;
            near = sym;
    }

(The else-if block can overwrite 'distance' with the same value).

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2022-05-27 16:16:35 +09:00
Masahiro Yamada
c5c468dcc2 modpost: reuse ARRAY_SIZE() macro for section_mismatch()
Move ARRAY_SIZE() from file2alias.c to modpost.h to reuse it in
section_mismatch().

Also, move the variable 'check' inside the for-loop.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2022-05-27 16:15:40 +09:00
Masahiro Yamada
76954527fe modpost: remove the unused argument of check_sec_ref()
check_sec_ref() does not use the first parameter 'mod'.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2022-05-27 16:10:10 +09:00
Masahiro Yamada
d6b732666a modpost: fix undefined behavior of is_arm_mapping_symbol()
The return value of is_arm_mapping_symbol() is unpredictable when "$"
is passed in.

strchr(3) says:
  The strchr() and strrchr() functions return a pointer to the matched
  character or NULL if the character is not found. The terminating null
  byte is considered part of the string, so that if c is specified as
  '\0', these functions return a pointer to the terminator.

When str[1] is '\0', strchr("axtd", str[1]) is not NULL, and str[2] is
referenced (i.e. buffer overrun).

Test code
---------

  char str1[] = "abc";
  char str2[] = "ab";

  strcpy(str1, "$");
  strcpy(str2, "$");

  printf("test1: %d\n", is_arm_mapping_symbol(str1));
  printf("test2: %d\n", is_arm_mapping_symbol(str2));

Result
------

  test1: 0
  test2: 1

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2022-05-27 16:07:44 +09:00
Alexander Lobakin
b5beffa20d modpost: fix removing numeric suffixes
With the `-z unique-symbol` linker flag or any similar mechanism,
it is possible to trigger the following:

ERROR: modpost: "param_set_uint.0" [vmlinux] is a static EXPORT_SYMBOL

The reason is that for now the condition from remove_dot():

if (m && (s[n + m] == '.' || s[n + m] == 0))

which was designed to test if it's a dot or a '\0' after the suffix
is never satisfied.
This is due to that `s[n + m]` always points to the last digit of a
numeric suffix, not on the symbol next to it (from a custom debug
print added to modpost):

param_set_uint.0, s[n + m] is '0', s[n + m + 1] is '\0'

So it's off-by-one and was like that since 2014.

Fix this for the sake of any potential upcoming features, but don't
bother stable-backporting, as it's well hidden -- apart from that
LD flag, it can be triggered only with GCC LTO which never landed
upstream.

Fixes: fcd38ed0ff ("scripts: modpost: fix compilation warning")
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-05-27 16:05:01 +09:00
Masahiro Yamada
7b4537199a kbuild: link symbol CRCs at final link, removing CONFIG_MODULE_REL_CRCS
include/{linux,asm-generic}/export.h defines a weak symbol, __crc_*
as a placeholder.

Genksyms writes the version CRCs into the linker script, which will be
used for filling the __crc_* symbols. The linker script format depends
on CONFIG_MODULE_REL_CRCS. If it is enabled, __crc_* holds the offset
to the reference of CRC.

It is time to get rid of this complexity.

Now that modpost parses text files (.*.cmd) to collect all the CRCs,
it can generate C code that will be linked to the vmlinux or modules.

Generate a new C file, .vmlinux.export.c, which contains the CRCs of
symbols exported by vmlinux. It is compiled and linked to vmlinux in
scripts/link-vmlinux.sh.

Put the CRCs of symbols exported by modules into the existing *.mod.c
files. No additional build step is needed for modules. As before,
*.mod.c are compiled and linked to *.ko in scripts/Makefile.modfinal.

No linker magic is used here. The new C implementation works in the
same way, whether CONFIG_RELOCATABLE is enabled or not.
CONFIG_MODULE_REL_CRCS is no longer needed.

Previously, Kbuild invoked additional $(LD) to update the CRCs in
objects, but this step is unneeded too.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nicolas Schier <nicolas@fjasle.eu>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM-14 (x86-64)
2022-05-24 16:33:20 +09:00
Masahiro Yamada
f292d875d0 modpost: extract symbol versions from *.cmd files
Currently, CONFIG_MODVERSIONS needs extra link to embed the symbol
versions into ELF objects. Then, modpost extracts the version CRCs
from them.

The following figures show how it currently works, and how I am trying
to change it.

Current implementation
======================
                                                           |----------|
                 embed CRC      -------------------------->| final    |
       $(CC)       $(LD)       /  |---------|              | link for |
       -----> *.o -------> *.o -->| modpost |              | vmlinux  |
      /              /            |         |-- *.mod.c -->| or       |
     / genksyms     /             |---------|              | module   |
  *.c ------> *.symversions                                |----------|

Genksyms outputs the calculated CRCs in the form of linker script
(*.symversions), which is used by $(LD) to update the object.

If CONFIG_LTO_CLANG=y, the build process is much more complex. Embedding
the CRCs is postponed until the LLVM bitcode is converted into ELF,
creating another intermediate *.prelink.o.

However, this complexity is unneeded. There is no reason why we must
embed version CRCs in objects so early.

There is final link stage for vmlinux (scripts/link-vmlinux.sh) and
modules (scripts/Makefile.modfinal). We can link CRCs at the very last
moment.

New implementation
==================
                                                           |----------|
                   --------------------------------------->| final    |
       $(CC)      /    |---------|                         | link for |
       -----> *.o ---->|         |                         | vmlinux  |
      /                | modpost |--- .vmlinux.export.c -->| or       |
     / genksyms        |         |--- *.mod.c ------------>| module   |
  *.c ------> *.cmd -->|---------|                         |----------|

Pass the symbol versions to modpost as separate text data, which are
available in *.cmd files.

This commit changes modpost to extract CRCs from *.cmd files instead of
from ELF objects.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM-14 (x86-64)
2022-05-24 00:53:06 +09:00
Masahiro Yamada
69c4cc99bb modpost: add sym_find_with_module() helper
find_symbol() returns the first symbol found in the hash table. This
table is global, so it may return a symbol from an unexpected module.

There is a case where we want to search for a symbol with a given name
in a specified module.

Add sym_find_with_module(), which receives the module pointer as the
second argument. It is equivalent to find_module() if NULL is passed
as the module pointer.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM-14 (x86-64)
2022-05-24 00:52:12 +09:00
Masahiro Yamada
2a66c3124a modpost: change the license of EXPORT_SYMBOL to bool type
There were more EXPORT_SYMBOL types in the past. The following commits
removed unused ones.

 - f1c3d73e97 ("module: remove EXPORT_SYMBOL_GPL_FUTURE")
 - 367948220f ("module: remove EXPORT_UNUSED_SYMBOL*")

There are 3 remaining in enum export, but export_unknown does not make
any sense because we never expect such a situation like "we do not know
how it was exported".

If the symbol name starts with "__ksymtab_", but the section name
does not start with "___ksymtab+" or "___ksymtab_gpl+", it is not an
exported symbol.

It occurs when a variable starting with "__ksymtab_" is directly defined:

   int __ksymtab_foo;

Presumably, there is no practical issue for using such a weird variable
name (but there is no good reason for doing so, either).

Anyway, that is not an exported symbol. Setting export_unknown is not
the right thing to do. Do not call sym_add_exported() in this case.

With pointless export_unknown removed, the export type finally becomes
boolean (either EXPORT_SYMBOL or EXPORT_SYMBOL_GPL).

I renamed the field name to is_gpl_only. EXPORT_SYMBOL_GPL sets it true.
Only GPL-compatible modules can use it.

I removed the orphan comment, "How a symbol is exported", which is
unrelated to sec_mismatch_count. It is about enum export.
See commit bd5cbcedf4 ("kbuild: export-type enhancement to modpost.c")

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
Tested-by: Nathan Chancellor <nathan@kernel.org>
2022-05-11 21:46:39 +09:00
Masahiro Yamada
a44abaca0e modpost: move *.mod.c generation to write_mod_c_files()
A later commit will add more code to this list_for_each_entry loop.

Before that, move the loop body into a separate helper function.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
Tested-by: Nathan Chancellor <nathan@kernel.org>
2022-05-11 21:46:38 +09:00
Masahiro Yamada
7fedac9698 modpost: merge add_{intree_flag,retpoline,staging_flag} to add_header
add_intree_flag(), add_retpoline(), and add_staging_flag() are small
enough to be merged into add_header().

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
Tested-by: Nathan Chancellor <nathan@kernel.org>
2022-05-11 21:46:38 +09:00
Masahiro Yamada
f18379a302 modpost: split new_symbol() to symbol allocation and hash table addition
new_symbol() does two things; allocate a new symbol and register it
to the hash table.

Using a separate function for each is easier to understand.

Replace new_symbol() with hash_add_symbol(). Remove the second parameter
of alloc_symbol().

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2022-05-08 03:17:01 +09:00
Masahiro Yamada
e76cc48d8e modpost: make sym_add_exported() always allocate a new symbol
Currently, sym_add_exported() does not allocate a symbol if the same
name symbol already exists in the hash table.

This does not reflect the real use cases. You can let an external
module override the in-tree one. In this case, the external module
will export the same name symbols as the in-tree one. However,
modpost simply ignores those symbols, then Module.symvers for the
external module loses its symbols.

sym_add_exported() should allocate a new symbol.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2022-05-08 03:17:01 +09:00
Masahiro Yamada
b842271108 modpost: make multiple export error
This is currently a warning, but I think modpost should stop building
in this case.

If the same symbol is exported multiple times and we let it keep going,
the sanity check becomes difficult.

Only the legitimate case is that an external module overrides the
corresponding in-tree module to provide a different implementation
with the same interface.

Also, there exists an upstream example that exploits this feature.

  $ make M=tools/testing/nvdimm

... builds tools/testing/nvdimm/libnvdimm.ko. This is a mocked module
that overrides the symbols from drivers/nvdimm/libnvdimm.ko.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2022-05-08 03:17:01 +09:00
Masahiro Yamada
f841536e8c modpost: dump Module.symvers in the same order of modules.order
modpost dumps the exported symbols into Module.symvers, but currently
in random order because it iterates in the hash table.

Add a linked list of exported symbols in struct module, so we can
iterate on symbols per module.

This commit makes Module.symvers much more readable; the outer loop in
write_dump() iterates over the modules in the order of modules.order,
and the inner loop dumps symbols in each module.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2022-05-08 03:17:01 +09:00
Masahiro Yamada
ab489d6002 modpost: traverse the namespace_list in order
Use the doubly linked list to traverse the list in the added order.
This makes the code more consistent.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2022-05-08 03:17:00 +09:00
Masahiro Yamada
4484054816 modpost: use doubly linked list for dump_lists
This looks easier to understand (just because this is a pattern in
the kernel code). No functional change is intended.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2022-05-08 03:17:00 +09:00
Masahiro Yamada
8a69152be9 modpost: traverse unresolved symbols in order
Currently, modpost manages unresolved in a singly linked list; it adds
a new node to the head, and traverses the list from new to old.

Use a doubly linked list to keep the order in the symbol table in the
ELF file.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2022-05-08 03:17:00 +09:00
Masahiro Yamada
e882e89bcf modpost: add sym_add_unresolved() helper
Add a small helper, sym_add_unresolved() to ease the further
refactoring.

Remove the 'weak' argument from alloc_symbol() because it is sensible
only for unresolved symbols.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2022-05-08 03:17:00 +09:00
Masahiro Yamada
325eba05e8 modpost: traverse modules in order
Currently, modpost manages modules in a singly linked list; it adds
a new node to the head, and traverses the list from new to old.

It works, but the error messages are shown in the reverse order.

If you have a Makefile like this:

  obj-m += foo.o bar.o

then, modpost shows error messages in bar.o, foo.o, in this order.

Use a doubly linked list to keep the order in modules.order; use
list_add_tail() for the node addition and list_for_each_entry() for
the list traverse.

Now that the kernel's list macros have been imported to modpost, I will
use them actively going forward.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
2022-05-08 03:17:00 +09:00