Commit Graph

103 Commits

Author SHA1 Message Date
Ashok Raj
cf5ab01c87 x86/microcode/intel: Add a minimum required revision for late loading
In general users, don't have the necessary information to determine
whether late loading of a new microcode version is safe and does not
modify anything which the currently running kernel uses already, e.g.
removal of CPUID bits or behavioural changes of MSRs.

To address this issue, Intel has added a "minimum required version"
field to a previously reserved field in the microcode header.  Microcode
updates should only be applied if the current microcode version is equal
to, or greater than this minimum required version.

Thomas made some suggestions on how meta-data in the microcode file could
provide Linux with information to decide if the new microcode is suitable
candidate for late loading. But even the "simpler" option requires a lot of
metadata and corresponding kernel code to parse it, so the final suggestion
was to add the 'minimum required version' field in the header.

When microcode changes visible features, microcode will set the minimum
required version to its own revision which prevents late loading.

Old microcode blobs have the minimum revision field always set to 0, which
indicates that there is no information and the kernel considers it
unsafe.

This is a pure OS software mechanism. The hardware/firmware ignores this
header field.

For early loading there is no restriction because OS visible features
are enumerated after the early load and therefore a change has no
effect.

The check is always enabled, but by default not enforced. It can be
enforced via Kconfig or kernel command line.

If enforced, the kernel refuses to late load microcode with a minimum
required version field which is zero or when the currently loaded
microcode revision is smaller than the minimum required revision.

If not enforced the load happens independent of the revision check to
stay compatible with the existing behaviour, but it influences the
decision whether the kernel is tainted or not. If the check signals that
the late load is safe, then the kernel is not tainted.

Early loading is not affected by this.

[ tglx: Massaged changelog and fixed up the implementation ]

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20231002115903.776467264@linutronix.de
2023-10-24 15:05:55 +02:00
Thomas Gleixner
9407bda845 x86/microcode: Prepare for minimal revision check
Applying microcode late can be fatal for the running kernel when the
update changes functionality which is in use already in a non-compatible
way, e.g. by removing a CPUID bit.

There is no way for admins which do not have access to the vendors deep
technical support to decide whether late loading of such a microcode is
safe or not.

Intel has added a new field to the microcode header which tells the
minimal microcode revision which is required to be active in the CPU in
order to be safe.

Provide infrastructure for handling this in the core code and a command
line switch which allows to enforce it.

If the update is considered safe the kernel is not tainted and the annoying
warning message not emitted. If it's enforced and the currently loaded
microcode revision is not safe for late loading then the load is aborted.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20231017211724.079611170@linutronix.de
2023-10-24 15:05:55 +02:00
Thomas Gleixner
7eb314a228 x86/microcode: Rendezvous and load in NMI
stop_machine() does not prevent the spin-waiting sibling from handling
an NMI, which is obviously violating the whole concept of rendezvous.

Implement a static branch right in the beginning of the NMI handler
which is nopped out except when enabled by the late loading mechanism.

The late loader enables the static branch before stop_machine() is
invoked. Each CPU has an nmi_enable in its control structure which
indicates whether the CPU should go into the update routine.

This is required to bridge the gap between enabling the branch and
actually being at the point where it is required to enter the loader
wait loop.

Each CPU which arrives in the stopper thread function sets that flag and
issues a self NMI right after that. If the NMI function sees the flag
clear, it returns. If it's set it clears the flag and enters the
rendezvous.

This is safe against a real NMI which hits in between setting the flag
and sending the NMI to itself. The real NMI will be swallowed by the
microcode update and the self NMI will then let stuff continue.
Otherwise this would end up with a spurious NMI.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20231002115903.489900814@linutronix.de
2023-10-24 15:05:55 +02:00
Thomas Gleixner
b7fcd995b2 x86/microcode/intel: Rework intel_find_matching_signature()
Take a cpu_signature argument and work from there. Move the match()
helper next to the callsite as there is no point for having it in
a header.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20231002115902.797820205@linutronix.de
2023-10-24 15:05:54 +02:00
Thomas Gleixner
11f96ac4c2 x86/microcode/intel: Reuse intel_cpu_collect_info()
No point for an almost duplicate function.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20231002115902.741173606@linutronix.de
2023-10-24 15:05:54 +02:00
Thomas Gleixner
164aa1ca53 x86/microcode/intel: Rework intel_cpu_collect_info()
Nothing needs struct ucode_cpu_info. Make it take struct cpu_signature,
let it return a boolean and simplify the implementation. Rename it now
that the silly name clash with collect_cpu_info() is gone.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20231017211722.851573238@linutronix.de
2023-10-24 15:05:53 +02:00
Thomas Gleixner
3973718cff x86/microcode/intel: Unify microcode apply() functions
Deduplicate the early and late apply() functions.

  [ bp: Rename the function which does the actual application to
      __apply_microcode() to differentiate it from
      microcode_ops.apply_microcode(). ]

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20231017211722.795508212@linutronix.de
2023-10-24 15:05:53 +02:00
Thomas Gleixner
f24f204405 x86/microcode/intel: Switch to kvmalloc()
Microcode blobs are getting larger and might soon reach the kmalloc()
limit. Switch over kvmalloc().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20231002115902.564323243@linutronix.de
2023-10-24 15:05:53 +02:00
Thomas Gleixner
2a1dada3d1 x86/microcode/intel: Save the microcode only after a successful late-load
There are situations where the late microcode is loaded into memory but
is not applied:

  1) The rendezvous fails
  2) The microcode is rejected by the CPUs

If any of this happens then the pointer which was updated at firmware
load time is stale and subsequent CPU hotplug operations either fail to
update or create inconsistent microcode state.

Save the loaded microcode in a separate pointer before the late load is
attempted and when successful, update the hotplug pointer accordingly
via a new microcode_ops callback.

Remove the pointless fallback in the loader to a microcode pointer which
is never populated.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20231002115902.505491309@linutronix.de
2023-10-24 15:05:53 +02:00
Thomas Gleixner
dd5e3e3ca6 x86/microcode/intel: Simplify early loading
The early loading code is overly complicated:

  - It scans the builtin/initrd for microcode not only on the BSP, but also
    on all APs during early boot and then later in the boot process it
    scans again to duplicate and save the microcode before initrd goes
    away.

    That's a pointless exercise because this can be simply done before
    bringing up the APs when the memory allocator is up and running.

 - Saving the microcode from within the scan loop is completely
   non-obvious and a left over of the microcode cache.

   This can be done at the call site now which makes it obvious.

Rework the code so that only the BSP scans the builtin/initrd microcode
once during early boot and save it away in an early initcall for later
use.

  [ bp: Test and fold in a fix from tglx ontop which handles the need to
    distinguish what save_microcode() does depending on when it is
    called:

     - when on the BSP during early load, it needs to find a newer
       revision than the one currently loaded on the BSP

     - later, before SMP init, it still runs on the BSP and gets the BSP
       revision just loaded and uses that revision to know which patch
       to save for the APs. For that it needs to find the exact one as
       on the BSP.
   ]

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20231017211722.629085215@linutronix.de
2023-10-24 15:02:36 +02:00
Thomas Gleixner
0177669ee6 x86/microcode/intel: Cleanup code further
Sanitize the microcode scan loop, fixup printks and move the loading
function for builtin microcode next to the place where it is used and mark
it __init.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20231002115902.389400871@linutronix.de
2023-10-19 14:10:50 +02:00
Thomas Gleixner
6b072022ab x86/microcode/intel: Simplify and rename generic_load_microcode()
so it becomes less obfuscated and rename it because there is nothing
generic about it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20231002115902.330295409@linutronix.de
2023-10-19 14:10:00 +02:00
Thomas Gleixner
b0f0bf5eef x86/microcode/intel: Simplify scan_microcode()
Make it readable and comprehensible.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20231002115902.271940980@linutronix.de
2023-10-19 12:32:40 +02:00
Ashok Raj
ae76d951f6 x86/microcode/intel: Rip out mixed stepping support for Intel CPUs
Mixed steppings aren't supported on Intel CPUs. Only one microcode patch
is required for the entire system. The caching of microcode blobs which
match the family and model is therefore pointless and in fact is
dysfunctional as CPU hotplug updates use only a single microcode blob,
i.e. the one where *intel_ucode_patch points to.

Remove the microcode cache and make it an AMD local feature.

  [ tglx:
     - save only at the end. Otherwise random microcode ends up in the
  	  pointer for early loading
     - free the ucode patch pointer in save_microcode_patch() only
    after kmemdup() has succeeded, as reported by Andrew Cooper ]

Originally-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20231017211722.404362809@linutronix.de
2023-10-19 12:29:39 +02:00
Thomas Gleixner
0b62f6cb07 x86/microcode/32: Move early loading after paging enable
32-bit loads microcode before paging is enabled. The commit which
introduced that has zero justification in the changelog. The cover
letter has slightly more content, but it does not give any technical
justification either:

  "The problem in current microcode loading method is that we load a
   microcode way, way too late; ideally we should load it before turning
   paging on.  This may only be practical on 32 bits since we can't get
   to 64-bit mode without paging on, but we should still do it as early
   as at all possible."

Handwaving word salad with zero technical content.

Someone claimed in an offlist conversation that this is required for
curing the ATOM erratum AAE44/AAF40/AAG38/AAH41. That erratum requires
an microcode update in order to make the usage of PSE safe. But during
early boot, PSE is completely irrelevant and it is evaluated way later.

Neither is it relevant for the AP on single core HT enabled CPUs as the
microcode loading on the AP is not doing anything.

On dual core CPUs there is a theoretical problem if a split of an
executable large page between enabling paging including PSE and loading
the microcode happens. But that's only theoretical, it's practically
irrelevant because the affected dual core CPUs are 64bit enabled and
therefore have paging and PSE enabled before loading the microcode on
the second core. So why would it work on 64-bit but not on 32-bit?

The erratum:

  "AAG38 Code Fetch May Occur to Incorrect Address After a Large Page is
   Split Into 4-Kbyte Pages

   Problem: If software clears the PS (page size) bit in a present PDE
   (page directory entry), that will cause linear addresses mapped through
   this PDE to use 4-KByte pages instead of using a large page after old
   TLB entries are invalidated. Due to this erratum, if a code fetch uses
   this PDE before the TLB entry for the large page is invalidated then it
   may fetch from a different physical address than specified by either the
   old large page translation or the new 4-KByte page translation. This
   erratum may also cause speculative code fetches from incorrect addresses."

The practical relevance for this is exactly zero because there is no
splitting of large text pages during early boot-time, i.e. between paging
enable and microcode loading, and neither during CPU hotplug.

IOW, this load microcode before paging enable is yet another voodoo
programming solution in search of a problem. What's worse is that it causes
at least two serious problems:

 1) When stackprotector is enabled, the microcode loader code has the
    stackprotector mechanics enabled. The read from the per CPU variable
    __stack_chk_guard is always accessing the virtual address either
    directly on UP or via %fs on SMP. In physical address mode this
    results in an access to memory above 3GB. So this works by chance as
    the hardware returns the same value when there is no RAM at this
    physical address. When there is RAM populated above 3G then the read
    is by chance the same as nothing changes that memory during the very
    early boot stage. That's not necessarily true during runtime CPU
    hotplug.

 2) When function tracing is enabled, the relevant microcode loader
    functions and the functions invoked from there will call into the
    tracing code and evaluate global and per CPU variables in physical
    address mode. What could potentially go wrong?

Cure this and move the microcode loading after the early paging enable, use
the new temporary initrd mapping and remove the gunk in the microcode
loader which is required to handle physical address mode.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20231017211722.348298216@linutronix.de
2023-10-18 22:15:01 +02:00
Thomas Gleixner
d2700f4067 x86/microcode/intel: Remove pointless mutex
There is no concurrency.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230812195728.069849788@linutronix.de
2023-08-13 18:42:55 +02:00
Thomas Gleixner
d44450c593 x86/microcode/intel: Remove debug code
This is really of dubious value.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230812195728.010895747@linutronix.de
2023-08-13 18:42:55 +02:00
Thomas Gleixner
d02a0efd0f x86/microcode: Move core specific defines to local header
There is no reason to expose all of this globally. Move everything which is
not required outside of the microcode specific code to local header files
and into the respective source files.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230812195727.952876381@linutronix.de
2023-08-13 18:42:55 +02:00
Ashok Raj
b0e67db12d x86/microcode/intel: Rename get_datasize() since its used externally
Rename get_datasize() to intel_microcode_get_datasize() and make it an inline.

  [ tglx: Make the argument typed and fix up the IFS code ]

Suggested-by: Boris Petkov <bp@alien8.de>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230812195727.894165745@linutronix.de
2023-08-13 18:42:55 +02:00
Ashok Raj
82ad097b02 x86/microcode: Include vendor headers into microcode.h
Currently vendor specific headers are included explicitly when used in
common code. Instead, include the vendor specific headers in
microcode.h, and include that in all usages.

No functional change.

Suggested-by: Boris Petkov <bp@alien8.de>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230812195727.776541545@linutronix.de
2023-08-13 18:42:55 +02:00
Thomas Gleixner
4da2131fac x86/microcode/intel: Move microcode functions out of cpu/intel.c
There is really no point to have that in the CPUID evaluation code. Move it
into the Intel-specific microcode handling along with the data
structures, defines and helpers required by it. The exports need to stay
for IFS.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230812195727.719202319@linutronix.de
2023-08-13 18:42:48 +02:00
Ashok Raj
a9a5cac225 x86/microcode/intel: Print old and new revision during early boot
Make early loading message match late loading message and print both old
and new revisions.

This is helpful to know what the BIOS loaded revision is before an early
update.

Cache the early BIOS revision before the microcode update and have
print_ucode_info() print both the old and new revision in the same
format as microcode_reload_late().

  [ bp: Massage, remove useless comment. ]

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20230120161923.118882-6-ashok.raj@intel.com
2023-01-21 14:55:21 +01:00
Ashok Raj
174f1b909a x86/microcode/intel: Pass the microcode revision to print_ucode_info() directly
print_ucode_info() takes a struct ucode_cpu_info pointer as parameter.
Its sole purpose is to print the microcode revision.

The only available ucode_cpu_info always describes the currently loaded
microcode revision. After a microcode update is successful, this is the
new revision, or on failure it is the original revision.

In preparation for future changes, replace the struct ucode_cpu_info
pointer parameter with a plain integer which contains the revision
number and adjust the call sites accordingly.

No functional change.

  [ bp:
    - Fix + cleanup commit message.
    - Revert arbitrary, unrelated change.
  ]

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20230120161923.118882-5-ashok.raj@intel.com
2023-01-21 14:55:20 +01:00
Linus Torvalds
a70210f415 Merge tag 'x86_microcode_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 microcode and IFS updates from Borislav Petkov:
 "The IFS (In-Field Scan) stuff goes through tip because the IFS driver
  uses the same structures and similar functionality as the microcode
  loader and it made sense to route it all through this branch so that
  there are no conflicts.

   - Add support for multiple testing sequences to the Intel In-Field
     Scan driver in order to be able to run multiple different test
     patterns. Rework things and remove the BROKEN dependency so that
     the driver can be enabled (Jithu Joseph)

   - Remove the subsys interface usage in the microcode loader because
     it is not really needed

   - A couple of smaller fixes and cleanups"

* tag 'x86_microcode_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
  x86/microcode/intel: Do not retry microcode reloading on the APs
  x86/microcode/intel: Do not print microcode revision and processor flags
  platform/x86/intel/ifs: Add missing kernel-doc entry
  Revert "platform/x86/intel/ifs: Mark as BROKEN"
  Documentation/ABI: Update IFS ABI doc
  platform/x86/intel/ifs: Add current_batch sysfs entry
  platform/x86/intel/ifs: Remove reload sysfs entry
  platform/x86/intel/ifs: Add metadata validation
  platform/x86/intel/ifs: Use generic microcode headers and functions
  platform/x86/intel/ifs: Add metadata support
  x86/microcode/intel: Use a reserved field for metasize
  x86/microcode/intel: Add hdr_type to intel_microcode_sanity_check()
  x86/microcode/intel: Reuse microcode_sanity_check()
  x86/microcode/intel: Use appropriate type in microcode_sanity_check()
  x86/microcode/intel: Reuse find_matching_signature()
  platform/x86/intel/ifs: Remove memory allocation from load path
  platform/x86/intel/ifs: Remove image loading during init
  platform/x86/intel/ifs: Return a more appropriate error code
  platform/x86/intel/ifs: Remove unused selection
  x86/microcode: Drop struct ucode_cpu_info.valid
  ...
2022-12-13 15:05:29 -08:00
Ashok Raj
be1b670f61 x86/microcode/intel: Do not retry microcode reloading on the APs
The retries in load_ucode_intel_ap() were in place to support systems
with mixed steppings. Mixed steppings are no longer supported and there is
only one microcode image at a time. Any retries will simply reattempt to
apply the same image over and over without making progress.

  [ bp: Zap the circumstantial reasoning from the commit message. ]

Fixes: 06b8534cb7 ("x86/microcode: Rework microcode loading")
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20221129210832.107850-3-ashok.raj@intel.com
2022-12-05 21:22:21 +01:00