Commit Graph

825534 Commits

Author SHA1 Message Date
Sebastian Andrzej Siewior
383c252545 x86/entry: Add TIF_NEED_FPU_LOAD
Add TIF_NEED_FPU_LOAD. This flag is used for loading the FPU registers
before returning to userland. It must not be set on systems without a
FPU.

If this flag is cleared, the CPU's FPU registers hold the latest,
up-to-date content of the current task's (current()) FPU registers.
The in-memory copy (union fpregs_state) is not valid.

If this flag is set, then all of CPU's FPU registers may hold a random
value (except for PKRU) and it is required to load the content of the
FPU registers on return to userland.

Introduce it now as a preparatory change before adding the main feature.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Aubrey Li <aubrey.li@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-17-bigeasy@linutronix.de
2019-04-11 16:21:51 +02:00
Rik van Riel
0cecca9d03 x86/fpu: Eager switch PKRU state
While most of a task's FPU state is only needed in user space, the
protection keys need to be in place immediately after a context switch.

The reason is that any access to userspace memory while running in
kernel mode also needs to abide by the memory permissions specified in
the protection keys.

The "eager switch" is a preparation for loading the FPU state on return
to userland. Instead of decoupling PKRU state from xstate, update PKRU
within xstate on write operations by the kernel.

For user tasks the PKRU should be always read from the xsave area and it
should not change anything because the PKRU value was loaded as part of
FPU restore.

For kernel threads the default "init_pkru_value" will be written. Before
this commit, the kernel thread would end up with a random value which it
inherited from the previous user task.

 [ bigeasy: save pkru to xstate, no cache, don't use __raw_xsave_addr() ]

 [ bp: update commit message, sort headers properly in asm/fpu/xstate.h ]

Signed-off-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Aubrey Li <aubrey.li@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Juergen Gross <jgross@suse.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-16-bigeasy@linutronix.de
2019-04-11 15:57:10 +02:00
Sebastian Andrzej Siewior
0556cbdc2f x86/pkeys: Don't check if PKRU is zero before writing it
write_pkru() checks if the current value is the same as the expected
value. So instead of just checking if the current and new value is zero
(and skip the write in such a case) we can benefit from that.

Remove the zero check of PKRU, __write_pkru() provides such a check now.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-15-bigeasy@linutronix.de
2019-04-11 15:41:05 +02:00
Sebastian Andrzej Siewior
577ff465f5 x86/fpu: Only write PKRU if it is different from current
According to Dave Hansen, WRPKRU is more expensive than RDPKRU. It has
a higher cycle cost and it's also practically a (light) speculation
barrier.

As an optimisation read the current PKRU value and only write the new
one if it is different.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-14-bigeasy@linutronix.de
2019-04-11 15:41:05 +02:00
Sebastian Andrzej Siewior
c806e88734 x86/pkeys: Provide *pkru() helpers
Dave Hansen asked for __read_pkru() and __write_pkru() to be
symmetrical.

As part of the series __write_pkru() will read back the value and only
write it if it is different.

In order to make both functions symmetrical, move the function
containing only the opcode asm into a function called like the
instruction itself.

__write_pkru() will just invoke wrpkru() but in a follow-up patch will
also read back the value.

 [ bp: Convert asm opcode wrapper names to rd/wrpkru(). ]

Suggested-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Juergen Gross <jgross@suse.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-13-bigeasy@linutronix.de
2019-04-11 15:40:58 +02:00
Sebastian Andrzej Siewior
abd16d68d6 x86/fpu: Use a feature number instead of mask in two more helpers
After changing the argument of __raw_xsave_addr() from a mask to
number Dave suggested to check if it makes sense to do the same for
get_xsave_addr(). As it turns out it does.

Only get_xsave_addr() needs the mask to check if the requested feature
is part of what is supported/saved and then uses the number again. The
shift operation is cheaper compared to fls64() (find last bit set).
Also, the feature number uses less opcode space compared to the mask. :)

Make the get_xsave_addr() argument a xfeature number instead of a mask
and fix up its callers.

Furthermore, use xfeature_nr and xfeature_mask consistently.

This results in the following changes to the kvm code:

  feature -> xfeature_mask
  index -> xfeature_nr

Suggested-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Rik van Riel <riel@surriel.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Siarhei Liakh <Siarhei.Liakh@concurrent-rt.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-12-bigeasy@linutronix.de
2019-04-10 18:20:27 +02:00
Sebastian Andrzej Siewior
07baeb04f3 x86/fpu: Make __raw_xsave_addr() use a feature number instead of mask
Most users of __raw_xsave_addr() use a feature number, shift it to a
mask and then __raw_xsave_addr() shifts it back to the feature number.

Make __raw_xsave_addr() use the feature number as an argument.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-11-bigeasy@linutronix.de
2019-04-10 16:33:45 +02:00
Rik van Riel
4ee91519e1 x86/fpu: Add an __fpregs_load_activate() internal helper
Add a helper function that ensures the floating point registers for the
current task are active. Use with preemption disabled.

While at it, add fpregs_lock/unlock() helpers too, to be used in later
patches.

 [ bp: Add a comment about its intended usage. ]

Signed-off-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Aubrey Li <aubrey.li@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-10-bigeasy@linutronix.de
2019-04-10 16:23:14 +02:00
Sebastian Andrzej Siewior
0169f53e0d x86/fpu: Remove user_fpu_begin()
user_fpu_begin() sets fpu_fpregs_owner_ctx to task's fpu struct. This is
always the case since there is no lazy FPU anymore.

fpu_fpregs_owner_ctx is used during context switch to decide if it needs
to load the saved registers or if the currently loaded registers are
valid. It could be skipped during a

  taskA -> kernel thread -> taskA

switch because the switch to the kernel thread would not alter the CPU's
sFPU tate.

Since this field is always updated during context switch and
never invalidated, setting it manually (in user context) makes no
difference. A kernel thread with kernel_fpu_begin() block could
set fpu_fpregs_owner_ctx to NULL but a kernel thread does not use
user_fpu_begin().

This is a leftover from the lazy-FPU time.

Remove user_fpu_begin(), it does not change fpu_fpregs_owner_ctx's
content.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Aubrey Li <aubrey.li@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Nicolai Stange <nstange@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-9-bigeasy@linutronix.de
2019-04-10 15:58:44 +02:00
Sebastian Andrzej Siewior
2722146eb7 x86/fpu: Remove fpu->initialized
The struct fpu.initialized member is always set to one for user tasks
and zero for kernel tasks. This avoids saving/restoring the FPU
registers for kernel threads.

The ->initialized = 0 case for user tasks has been removed in previous
changes, for instance, by doing an explicit unconditional init at fork()
time for FPU-less systems which was otherwise delayed until the emulated
opcode.

The context switch code (switch_fpu_prepare() + switch_fpu_finish())
can't unconditionally save/restore registers for kernel threads. Not
only would it slow down the switch but also load a zeroed xcomp_bv for
XSAVES.

For kernel_fpu_begin() (+end) the situation is similar: EFI with runtime
services uses this before alternatives_patched is true. Which means that
this function is used too early and it wasn't the case before.

For those two cases, use current->mm to distinguish between user and
kernel thread. For kernel_fpu_begin() skip save/restore of the FPU
registers.

During the context switch into a kernel thread don't do anything. There
is no reason to save the FPU state of a kernel thread.

The reordering in __switch_to() is important because the current()
pointer needs to be valid before switch_fpu_finish() is invoked so ->mm
is seen of the new task instead the old one.

N.B.: fpu__save() doesn't need to check ->mm because it is called by
user tasks only.

 [ bp: Massage. ]

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Aubrey Li <aubrey.li@intel.com>
Cc: Babu Moger <Babu.Moger@amd.com>
Cc: "Chang S. Bae" <chang.seok.bae@intel.com>
Cc: Dmitry Safonov <dima@arista.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Nicolai Stange <nstange@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-8-bigeasy@linutronix.de
2019-04-10 15:42:40 +02:00
Sebastian Andrzej Siewior
39388e80f9 x86/fpu: Don't save fxregs for ia32 frames in copy_fpstate_to_sigframe()
In commit

  72a671ced6 ("x86, fpu: Unify signal handling code paths for x86 and x86_64 kernels")

the 32bit and 64bit path of the signal delivery code were merged.

The 32bit version:

  int save_i387_xstate_ia32(void __user *buf)
  …
         if (cpu_has_xsave)
                 return save_i387_xsave(fp);
         if (cpu_has_fxsr)
                 return save_i387_fxsave(fp);

The 64bit version:

  int save_i387_xstate(void __user *buf)
  …
         if (user_has_fpu()) {
                 if (use_xsave())
                         err = xsave_user(buf);
                 else
                         err = fxsave_user(buf);

                 if (unlikely(err)) {
                         __clear_user(buf, xstate_size);
                         return err;

The merge:

  int save_xstate_sig(void __user *buf, void __user *buf_fx, int size)
  …
         if (user_has_fpu()) {
                 /* Save the live register state to the user directly. */
                 if (save_user_xstate(buf_fx))
                         return -1;
                 /* Update the thread's fxstate to save the fsave header. */
                 if (ia32_fxstate)
                         fpu_fxsave(&tsk->thread.fpu);

I don't think that we needed to save the FPU registers to ->thread.fpu
because the registers were stored in buf_fx. Today the state will be
restored from buf_fx after the signal was handled (I assume that this
was also the case with lazy-FPU).

Since commit

  66463db4fc ("x86, fpu: shift drop_init_fpu() from save_xstate_sig() to handle_signal()")

it is ensured that the signal handler starts with clear/fresh set of FPU
registers which means that the previous store is futile.

Remove the copy_fxregs_to_kernel() call because task's FPU state is
cleared later in handle_signal() via fpu__clear().

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-7-bigeasy@linutronix.de
2019-04-10 14:46:35 +02:00
Sebastian Andrzej Siewior
fbcc9e0c37 x86/fpu: Remove fpu->initialized usage in copy_fpstate_to_sigframe()
With lazy-FPU support the (now named variable) ->initialized was set to
true if the CPU's FPU registers were holding a valid state of the
FPU registers for the active process. If it was set to false then the
FPU state was saved in fpu->state and the FPU was deactivated.

With lazy-FPU gone, ->initialized is always true for user threads and
kernel threads never call this function so ->initialized is always true
in copy_fpstate_to_sigframe().

The using_compacted_format() check is also a leftover from the lazy-FPU
time. In the

  ->initialized == false

case copy_to_user() would copy the compacted buffer while userland would
expect the non-compacted format instead. So in order to save the FPU
state in the non-compacted form it issues XSAVE to save the *current*
FPU state.

If the FPU is not enabled, the attempt raises the FPU trap, the trap
restores the FPU contents and re-enables the FPU and XSAVE is invoked
again and succeeds.

*This* does not longer work since commit

  bef8b6da95 ("x86/fpu: Handle #NM without FPU emulation as an error")

Remove the check for ->initialized because it is always true and remove
the false condition. Update the comment to reflect that the state is
always live.

 [ bp: Massage. ]

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-6-bigeasy@linutronix.de
2019-04-09 20:48:11 +02:00
Sebastian Andrzej Siewior
88f5260a3b x86/fpu: Always init the state in fpu__clear()
fpu__clear() only initializes the state if the CPU has FPU support.
This initialisation is also required for FPU-less systems and takes
place in math_emulate(). Since fpu__initialize() only performs the
initialization if ->initialized is zero it does not matter that it
is invoked each time an opcode is emulated. It makes the removal of
->initialized easier if the struct is also initialized in the FPU-less
case at the same time.

Move fpu__initialize() before the FPU feature check so it is also
performed in the FPU-less case too.

 [ bp: Massage a bit. ]

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Aubrey Li <aubrey.li@intel.com>
Cc: Bill Metzenthen <billm@melbpc.org.au>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Nicolai Stange <nstange@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-5-bigeasy@linutronix.de
2019-04-09 19:28:06 +02:00
Sebastian Andrzej Siewior
60e528d6ce x86/fpu: Remove preempt_disable() in fpu__clear()
The preempt_disable() section was introduced in commit

  a10b6a16cd ("x86/fpu: Make the fpu state change in fpu__clear() scheduler-atomic")

and it was said to be temporary.

fpu__initialize() initializes the FPU struct to its initial value and
then sets ->initialized to 1. The last part is the important one.
The content of the state does not matter because it gets set via
copy_init_fpstate_to_fpregs().

A preemption here has little meaning because the registers will always be
set to the same content after copy_init_fpstate_to_fpregs(). A softirq
with a kernel_fpu_begin() could also force to save FPU's registers after
fpu__initialize() without changing the outcome here.

Remove the preempt_disable() section in fpu__clear(), preemption here
does not hurt.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Nicolai Stange <nstange@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-4-bigeasy@linutronix.de
2019-04-09 19:27:46 +02:00
Sebastian Andrzej Siewior
6dd677a044 x86/fpu: Remove fpu__restore()
There are no users of fpu__restore() so it is time to remove it. The
comment regarding fpu__restore() and TS bit is stale since commit

  b3b0870ef3 ("i387: do not preload FPU state at task switch time")

and has no meaning since.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Aubrey Li <aubrey.li@intel.com>
Cc: Babu Moger <Babu.Moger@amd.com>
Cc: "Chang S. Bae" <chang.seok.bae@intel.com>
Cc: Dmitry Safonov <dima@arista.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: linux-doc@vger.kernel.org
Cc: Nicolai Stange <nstange@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-3-bigeasy@linutronix.de
2019-04-09 19:27:42 +02:00
Sebastian Andrzej Siewior
39ea9baffd x86/fpu: Remove fpu->initialized usage in __fpu__restore_sig()
This is a preparation for the removal of the ->initialized member in the
fpu struct.

__fpu__restore_sig() is deactivating the FPU via fpu__drop() and then
setting manually ->initialized followed by fpu__restore(). The result is
that it is possible to manipulate fpu->state and the state of registers
won't be saved/restored on a context switch which would overwrite
fpu->state:

fpu__drop(fpu):
  ...
  fpu->initialized = 0;
  preempt_enable();

  <--- context switch

Don't access the fpu->state while the content is read from user space
and examined/sanitized. Use a temporary kmalloc() buffer for the
preparation of the FPU registers and once the state is considered okay,
load it. Should something go wrong, return with an error and without
altering the original FPU registers.

The removal of fpu__initialize() is a nop because fpu->initialized is
already set for the user task.

 [ bp: Massage a bit. ]

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190403164156.19645-2-bigeasy@linutronix.de
2019-04-09 19:27:29 +02:00
Jann Horn
89833fab15 x86/fpu: Fix __user annotations
In save_xstate_epilog(), use __user when type-casting userspace
pointers.

In setup_sigcontext() and x32_setup_rt_frame(), cast the userspace
pointers to 'unsigned long __user *' before writing into them. These
pointers are originally '__u32 __user *' or '__u64 __user *', causing
sparse to complain when a userspace pointer is written into them. The
casts are okay because the pointers always point to pointer-sized
values.

Thanks to Luc Van Oostenryck and Al Viro for explaining this to me.

Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Mukesh Ojha <mojha@codeaurora.org>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190329214652.258477-3-jannh@google.com
2019-04-03 14:12:40 +02:00
Linus Torvalds
79a3aaa7b8 Linux 5.1-rc3 2019-03-31 14:39:29 -07:00
Linus Torvalds
63fc9c2348 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
 "A collection of x86 and ARM bugfixes, and some improvements to
  documentation.

  On top of this, a cleanup of kvm_para.h headers, which were exported
  by some architectures even though they not support KVM at all. This is
  responsible for all the Kbuild changes in the diffstat"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (28 commits)
  Documentation: kvm: clarify KVM_SET_USER_MEMORY_REGION
  KVM: doc: Document the life cycle of a VM and its resources
  KVM: selftests: complete IO before migrating guest state
  KVM: selftests: disable stack protector for all KVM tests
  KVM: selftests: explicitly disable PIE for tests
  KVM: selftests: assert on exit reason in CR4/cpuid sync test
  KVM: x86: update %rip after emulating IO
  x86/kvm/hyper-v: avoid spurious pending stimer on vCPU init
  kvm/x86: Move MSR_IA32_ARCH_CAPABILITIES to array emulated_msrs
  KVM: x86: Emulate MSR_IA32_ARCH_CAPABILITIES on AMD hosts
  kvm: don't redefine flags as something else
  kvm: mmu: Used range based flushing in slot_handle_level_range
  KVM: export <linux/kvm_para.h> and <asm/kvm_para.h> iif KVM is supported
  KVM: x86: remove check on nr_mmu_pages in kvm_arch_commit_memory_region()
  kvm: nVMX: Add a vmentry check for HOST_SYSENTER_ESP and HOST_SYSENTER_EIP fields
  KVM: SVM: Workaround errata#1096 (insn_len maybe zero on SMAP violation)
  KVM: Reject device ioctls from processes other than the VM's creator
  KVM: doc: Fix incorrect word ordering regarding supported use of APIs
  KVM: x86: fix handling of role.cr4_pae and rename it to 'gpte_size'
  KVM: nVMX: Do not inherit quadrant and invalid for the root shadow EPT
  ...
2019-03-31 08:55:59 -07:00
Linus Torvalds
915ee0da5e Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
 "A pile of x86 updates:

   - Prevent exceeding he valid physical address space in the /dev/mem
     limit checks.

   - Move all header content inside the header guard to prevent compile
     failures.

   - Fix the bogus __percpu annotation in this_cpu_has() which makes
     sparse very noisy.

   - Disable switch jump tables completely when retpolines are enabled.

   - Prevent leaking the trampoline address"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/realmode: Make set_real_mode_mem() static inline
  x86/cpufeature: Fix __percpu annotation in this_cpu_has()
  x86/mm: Don't exceed the valid physical address space
  x86/retpolines: Disable switch jump tables when retpolines are enabled
  x86/realmode: Don't leak the trampoline kernel address
  x86/boot: Fix incorrect ifdeffery scope
  x86/resctrl: Remove unused variable
2019-03-31 08:40:15 -07:00
Linus Torvalds
590627f755 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf tooling fixes from Thomas Gleixner:
 "Core libraries:
   - Fix max perf_event_attr.precise_ip detection.
   - Fix parser error for uncore event alias
   - Fixup ordering of kernel maps after obtaining the main kernel map
     address.

  Intel PT:
   - Fix TSC slip where A TSC packet can slip past MTC packets so that
     the timestamp appears to go backwards.
   - Fixes for exported-sql-viewer GUI conversion to python3.

  ARM coresight:
   - Fix the build by adding a missing case value for enumeration value
     introduced in newer library, that now is the required one.

  tool headers:
   - Syncronize kernel headers with the kernel, getting new io_uring and
     pidfd_send_signal syscalls so that 'perf trace' can handle them"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf pmu: Fix parser error for uncore event alias
  perf scripts python: exported-sql-viewer.py: Fix python3 support
  perf scripts python: exported-sql-viewer.py: Fix never-ending loop
  perf machine: Update kernel map address and re-order properly
  tools headers uapi: Sync powerpc's asm/kvm.h copy with the kernel sources
  tools headers: Update x86's syscall_64.tbl and uapi/asm-generic/unistd
  tools headers uapi: Update drm/i915_drm.h
  tools arch x86: Sync asm/cpufeatures.h with the kernel sources
  tools headers uapi: Sync linux/fcntl.h to get the F_SEAL_FUTURE_WRITE addition
  tools headers uapi: Sync asm-generic/mman-common.h and linux/mman.h
  perf evsel: Fix max perf_event_attr.precise_ip detection
  perf intel-pt: Fix TSC slip
  perf cs-etm: Add missing case value
2019-03-31 08:37:04 -07:00
Linus Torvalds
c29d85417c Merge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull CPU hotplug fixes from Thomas Gleixner:
 "Two SMT/hotplug related fixes:

   - Prevent crash when HOTPLUG_CPU is disabled and the CPU bringup
     aborts. This is triggered with the 'nosmt' command line option, but
     can happen by any abort condition. As the real unplug code is not
     compiled in, prevent the fail by keeping the CPU in zombie state.

   - Enforce HOTPLUG_CPU for SMP on x86 to avoid the above situation
     completely. With 'nosmt' being a popular option it's required to
     unplug the half brought up sibling CPUs (due to the MCE wreckage)
     completely"

* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/smp: Enforce CONFIG_HOTPLUG_CPU when SMP=y
  cpu/hotplug: Prevent crash when CPU bringup fails on CONFIG_HOTPLUG_CPU=n
2019-03-31 08:22:12 -07:00
Linus Torvalds
573efdc5ea Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixlet from Thomas Gleixner:
 "Trivial update to the maintainers file"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  MAINTAINERS: Remove deleted file from futex file pattern
2019-03-31 07:48:58 -07:00
Linus Torvalds
f78b5be2a5 Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core fixes from Thomas Gleixner:
 "A small set of core updates:

   - Make the watchdog respect the selected CPU mask again. That was
     broken by the rework of the watchdog thread management and caused
     inconsistent state and NMI watchdog being unstoppable.

   - Ensure that the objtool build can find the libelf location.

   - Remove dead kcore stub code"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  watchdog: Respect watchdog cpumask on CPU hotplug
  objtool: Query pkg-config for libelf location
  proc/kcore: Remove unused kclist_add_remap()
2019-03-31 07:47:21 -07:00
Linus Torvalds
6536c5f2c8 Merge tag 'powerpc-5.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
 "Three non-regression fixes.

   - Our optimised memcmp could read past the end of one of the buffers
     and potentially trigger a page fault leading to an oops.

   - Some of our code to read energy management data on PowerVM had an
     endian bug leading to bogus results.

   - When reporting a machine check exception we incorrectly reported
     TLB multihits as D-Cache multhits due to a missing entry in the
     array of causes.

  Thanks to: Chandan Rajendra, Gautham R. Shenoy, Mahesh Salgaonkar,
  Segher Boessenkool, Vaidyanathan Srinivasan"

* tag 'powerpc-5.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/pseries/mce: Fix misleading print for TLB mutlihit
  powerpc/pseries/energy: Use OF accessor functions to read ibm,drc-indexes
  powerpc/64: Fix memcmp reading past the end of src/dest
2019-03-31 07:44:13 -07:00