During later stages of math-emu bootup the following crash triggers:
math_emulate: 0060:c100d0a8
Kernel panic - not syncing: Math emulation needed in kernel
CPU: 0 PID: 1511 Comm: login Not tainted 4.2.0-rc7+ #1012
[...]
Call Trace:
[<c181d50d>] dump_stack+0x41/0x52
[<c181c918>] panic+0x77/0x189
[<c1003530>] ? math_error+0x140/0x140
[<c164c2d7>] math_emulate+0xba7/0xbd0
[<c100d0a8>] ? fpu__copy+0x138/0x1c0
[<c1109c3c>] ? __alloc_pages_nodemask+0x12c/0x870
[<c136ac20>] ? proc_clear_tty+0x40/0x70
[<c136ac6e>] ? session_clear_tty+0x1e/0x30
[<c1003530>] ? math_error+0x140/0x140
[<c1003575>] do_device_not_available+0x45/0x70
[<c100d0a8>] ? fpu__copy+0x138/0x1c0
[<c18258e6>] error_code+0x5a/0x60
[<c1003530>] ? math_error+0x140/0x140
[<c100d0a8>] ? fpu__copy+0x138/0x1c0
[<c100c205>] arch_dup_task_struct+0x25/0x30
[<c1048cea>] copy_process.part.51+0xea/0x1480
[<c115a8e5>] ? dput+0x175/0x200
[<c136af70>] ? no_tty+0x30/0x30
[<c1157242>] ? do_vfs_ioctl+0x322/0x540
[<c104a21a>] _do_fork+0xca/0x340
[<c1057b06>] ? SyS_rt_sigaction+0x66/0x90
[<c104a557>] SyS_clone+0x27/0x30
[<c1824a80>] sysenter_do_call+0x12/0x12
The reason is the incorrect assumption in fpu_copy(), that FNSAVE
can be executed from math-emu kernels as well.
Don't try to copy the registers, the soft state will be copied
by fork anyway, so the child task inherits the parent task's
soft math state.
With this fix applied math-emu kernels boot up fine on modern
hardware and the 'no387 nofxsr' boot options.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Bobby Powers <bobbypowers@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
On a math-emu bootup the following crash occurs:
Initializing CPU#0
------------[ cut here ]------------
kernel BUG at arch/x86/kernel/traps.c:779!
invalid opcode: 0000 [#1] SMP
[...]
EIP is at do_device_not_available+0xe/0x70
[...]
Call Trace:
[<c18238e6>] error_code+0x5a/0x60
[<c1002bd0>] ? math_error+0x140/0x140
[<c100bbd9>] ? fpu__init_cpu+0x59/0xa0
[<c1012322>] cpu_init+0x202/0x330
[<c104509f>] ? __native_set_fixmap+0x1f/0x30
[<c1b56ab0>] trap_init+0x305/0x346
[<c1b548af>] start_kernel+0x1a5/0x35d
[<c1b542b4>] i386_start_kernel+0x82/0x86
The reason is that in the following commit:
b1276c48e9 ("x86/fpu: Initialize fpregs in fpu__init_cpu_generic()")
I failed to consider math-emu's limitation that it cannot execute the
FNINIT instruction in kernel mode.
The long term fix might be to allow math-emu to execute (certain) kernel
mode FPU instructions, but for now apply the safe (albeit somewhat ugly)
fix: initialize the emulation state explicitly without trapping out to
the FPU emulator.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Complete the set of dependent features that need disabling at
once: XSAVEC, AVX-512 and all currently known to the kernel
extensions to it, as well as MPX need to be disabled too.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/55ACC40D0200007800092E6C@mail.emea.novell.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Don't burden architectures without dynamic task_struct sizing
with the overhead of dynamic sizing.
Also optimize the x86 code a bit by caching task_struct_size.
Acked-and-Tested-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1437128892-9831-3-git-send-email-mingo@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The FPU rewrite removed the dynamic allocations of 'struct fpu'.
But, this potentially wastes massive amounts of memory (2k per
task on systems that do not have AVX-512 for instance).
Instead of having a separate slab, this patch just appends the
space that we need to the 'task_struct' which we dynamically
allocate already. This saves from doing an extra slab
allocation at fork().
The only real downside here is that we have to stick everything
and the end of the task_struct. But, I think the
BUILD_BUG_ON()s I stuck in there should keep that from being too
fragile.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1437128892-9831-2-git-send-email-mingo@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Jan Kara and Thomas Gleixner reported boot crashes in the FPU
code:
general protection fault: 0000 [#1] SMP
RIP: 0010:[<ffffffff81048a6c>] [<ffffffff81048a6c>] mxcsr_feature_mask_init+0x1c/0x40
2b:* 0f ae 85 00 fe ff ff fxsave -0x200(%rbp)
and bisected it down to the following FPU commit:
91a8c2a5b4 ("x86/fpu: Clean up and fix MXCSR handling")
The reason is that the on-stack FPU registers state variable,
used by the FXSAVE instruction, did not have the required
minimum alignment of 16 bytes, causing the general protection
fault.
This is most likely a GCC bug in older GCC versions, but the
offending commit also added a bogus extra 32-byte alignment
(which GCC ignored too).
So fix this bug by making the variable static again, but also
mark it __initdata this time, because fpu__init_system_mxcsr()
is now an __init function.
Reported-and-bisected-by: Jan Kara <jack@suse.cz>
Reported-bisected-and-tested-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20150704075819.GA9201@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
I noticed that my MPX tracepoints were producing garbage for the
lower and upper bounds:
mpx_bounds_register_exception: address referenced: 0x00007fffffffccb7 bounds: lower: 0x0 ~upper: 0xffffffffffffffff
mpx_bounds_register_exception: address referenced: 0x00007fffffffccbf bounds: lower: 0x0 ~upper: 0xffffffffffffffff
This is, of course, bogus because 0x00007fffffffccbf is *within*
the bounds. I assumed that my instruction decoder was bad and
went looking at it. But I eventually realized that I was
getting a '0' offset back from xstate_offsets[BNDREGS].
It was being skipped in the initialization, which is obviously
bogus, so remove the extra leaf++.
This also goes an initializes xstate_offsets/sizes[] to -1 so
so that bugs like this will oops instead of silently failing
in interesting ways.
This was introduced by:
39f1acd ("x86/fpu/xstate: Don't assume the first zero xfeatures zero bit means the end")
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dave@sr71.net
Link: http://lkml.kernel.org/r/20150611193400.2E0B00DB@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The MPX code appears is calling a low-level FPU function
(copy_fpregs_to_fpstate()). This function is not able to
be called in all contexts, although it is safe to call
directly in some cases.
Although probably correct, the current code is ugly and
potentially error-prone. So, add a wrapper that calls
the (slightly) higher-level fpu__save() (which is preempt-
safe) and also ensures that we even *have* an FPU context
(in the case that this was called when in lazy FPU mode).
Ingo had this to say about the details about when we need
preemption disabled:
> it's indeed generally unsafe to access/copy FPU registers with preemption enabled,
> for two reasons:
>
> - on older systems that use FSAVE the instruction destroys FPU register
> contents, which has to be handled carefully
>
> - even on newer systems if we copy to FPU registers (which this code doesn't)
> then we don't want a context switch to occur in the middle of it, because a
> context switch will write to the fpstate, potentially overwriting our new data
> with old FPU state.
>
> But it's safe to access FPU registers with preemption enabled in a couple of
> special cases:
>
> - potentially destructively saving FPU registers: the signal handling code does
> this in copy_fpstate_to_sigframe(), because it can rely on the signal restore
> side to restore the original FPU state.
>
> - reading FPU registers on modern systems: we don't do this anywhere at the
> moment, mostly to keep symmetry with older systems where FSAVE is
> destructive.
>
> - initializing FPU registers on modern systems: fpu__clear() does this. Here
> it's safe because we don't copy from the fpstate.
>
> - directly writing FPU registers from user-space memory (!). We do this in
> fpu__restore_sig(), and it's safe because neither context switches nor
> irq-handler FPU use can corrupt the source context of the copy (which is
> user-space memory).
>
> Note that the MPX code's current use of copy_fpregs_to_fpstate() was safe I think,
> because:
>
> - MPX is predicated on eagerfpu, so the destructive F[N]SAVE instruction won't be
> used.
>
> - the code was only reading FPU registers, and was doing it only in places that
> guaranteed that an FPU state was already active (i.e. didn't do it in
> kthreads)
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave@sr71.net>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Suresh Siddha <sbsiddha@gmail.com>
Cc: bp@alien8.de
Link: http://lkml.kernel.org/r/20150607183700.AA881696@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
get_xsave_addr() assumes that if an xsave bit is present in the
hardware (pcntxt_mask) that it is present in a given xsave
buffer. Due to an bug in the xsave code on all of the systems
that have MPX (and thus all the users of this code), that has
been a true assumption.
But, the bug is getting fixed, so our assumption is not going
to hold any more.
It's quite possible (and normal) for an enabled state to be
present on 'pcntxt_mask', but *not* in 'xstate_bv'. We need
to consult 'xstate_bv'.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dave Hansen <dave@sr71.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20150607183700.1E739B34@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
copy_kernel_to_xregs_booting() has a second parameter that is the mask
of xfeatures that should be copied - but this parameter is always -1.
Simplify the call site of this function, this also makes it more
similar to the function call signature of other copy_kernel_to*regs()
functions.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Bobby Powers <bobbypowers@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Bring the __copy_fpstate_to_fpregs() and copy_fpstate_to_fpregs() functions
in line with the parameter passing convention of other kernel-to-FPU-registers
copying functions: pass around an in-memory FPU register state pointer,
instead of struct fpu *.
NOTE: This patch also changes the assembly constraint of the FXSAVE-leak
workaround from 'fpu->fpregs_active' to 'fpstate' - but that is fine,
as we only need a valid memory address there for the FILDL instruction.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Bobby Powers <bobbypowers@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
None of the copy_kernel_to_*regs() FPU register copying functions are
supposed to fail, and all of them have debugging checks that enforce
this.
Remove their return values and simplify their call sites, which have
redundant error checks and error handling code paths.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Bobby Powers <bobbypowers@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Bring the __copy_fpstate_to_fpregs() and copy_fpstate_to_fpregs() functions
in line with the naming of other kernel-to-FPU-registers copying functions.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Bobby Powers <bobbypowers@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The copy_fpstate_to_fpregs() function is never supposed to fail,
so add a debugging check to its call site in fpu__restore().
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Bobby Powers <bobbypowers@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
fpu__activate_fpstate_write() is used before ptrace writes to the fpstate
context. Because it expects the modified registers to be reloaded on the
nexts context switch, it's only valid to call this function for stopped
child tasks.
- add a debugging check for this assumption
- remove code that only runs if the current task's FPU state needs
to be saved, which cannot occur here
- update comments to match the implementation
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Bobby Powers <bobbypowers@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Remaining users of fpu__activate_fpstate() are all places that want to modify
FPU registers, rename the function to fpu__activate_fpstate_write() according
to this usage.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Bobby Powers <bobbypowers@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
fpu__activate_fpstate_read() is used before FPU registers are
read from the fpstate by ptrace and core dumping.
It's not necessary to unlazy non-current child tasks in this case,
since the reading of registers is non-destructive.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Bobby Powers <bobbypowers@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Currently fpu__activate_fpstate() is used for two distinct purposes:
- read access by ptrace and core dumping, where in the core dumping
case the current task's FPU state may be examined as well.
- write access by ptrace, which modifies FPU registers and expects
the modified registers to be reloaded on the next context switch.
Split out the reading side into fpu__activate_fpstate_read().
( Note that this is just a pure duplication of fpu__activate_fpstate()
for the time being, we'll optimize the new function in the next patch. )
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Bobby Powers <bobbypowers@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Bobby Powers reported the following FPU warning during ELF coredumping:
WARNING: CPU: 0 PID: 27452 at arch/x86/kernel/fpu/core.c:324 fpu__activate_stopped+0x8a/0xa0()
This warning unearthed an invalid assumption about fpu__activate_stopped()
that I added in:
67e97fc2ec ("x86/fpu: Rename init_fpu() to fpu__unlazy_stopped() and add debugging check")
the old init_fpu() function had an (intentional but obscure) side effect:
when FPU registers are accessed for the current task, for reading, then
it synchronized live in-register FPU state with the fpstate by saving it.
So fix this bug by saving the FPU if we are the current task. We'll
still warn in fpu__save() if this is called for not yet stopped
child tasks, so the debugging check is still preserved.
Also rename the function to fpu__activate_fpstate(), because it's not
exclusively used for stopped tasks, but for the current task as well.
( Note that this bug calls for a cleaner separation of access-for-read
and access-for-modification FPU methods, but we'll do that in separate
patches. )
Reported-by: Bobby Powers <bobbypowers@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Remove obsolete comment about __init limitations: in the new code there aren't any.
Also standardize the comment style in the function while at it.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Conflicts:
arch/x86/kernel/i387.c
This commit is conflicting:
e88221c50c ("x86/fpu: Disable XSAVES* support for now")
These functions changed a lot, move the quirk to arch/x86/kernel/fpu/init.c's
fpu__init_system_xstate_size_legacy().
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Explain the functions and also standardize their style
and naming.
No change in functionality.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We had a number of FPU init related boot option handlers
in arch/x86/kernel/cpu/common.c - move them over into
arch/x86/kernel/fpu/init.c to have them all in a
single place.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
There are various internal FPU state debugging checks that never
trigger in practice, but which are useful for FPU code development.
Separate these out into CONFIG_X86_DEBUG_FPU=y, and also add a
couple of new ones.
The size difference is about 0.5K of code on defconfig:
text data bss filename
15028906 2578816 1638400 vmlinux
15029430 2578816 1638400 vmlinux
( Keep this enabled by default until the new FPU code is debugged. )
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This cleans up the call sites and the function a bit,
and also makes it more symmetric with the other high
level FPU state handling functions.
It's still only valid for the current task, as we copy
to the FPU registers of the current CPU.
No change in functionality.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Now that all the FPU init function call dependencies are
cleaned up we can propagate __init annotations deeper.
This shrinks the runtime size of the kernel a bit, and
also addresses a few section warnings.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
So call setup_xstate_comp() from the xstate init code, not
from the generic fpu__init_system() code.
This allows us to remove the protytype from xstate.h as well.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The current xstate code in setup_xstate_features() assumes that
the first zero bit means the end of xfeatures - but that is not
so, the SDM clearly states that an arbitrary set of xfeatures
might be enabled - and it is also clear from the description
of the compaction feature that holes are possible:
"13-6 Vol. 1MANAGING STATE USING THE XSAVE FEATURE SET
[...]
Compacted format. Each state component i (i ≥ 2) is located at a byte
offset from the base address of the XSAVE area based on the XCOMP_BV
field in the XSAVE header:
— If XCOMP_BV[i] = 0, state component i is not in the XSAVE area.
— If XCOMP_BV[i] = 1, the following items apply:
• If XCOMP_BV[j] = 0 for every j, 2 ≤ j < i, state component i is
located at a byte offset 576 from the base address of the XSAVE
area. (This item applies if i is the first bit set in bits 62:2 of
the XCOMP_BV; it implies that state component i is located at the
beginning of the extended region.)
• Otherwise, let j, 2 ≤ j < i, be the greatest value such that
XCOMP_BV[j] = 1. Then state component i is located at a byte offset
X from the location of state component j, where X is the number of
bytes required for state component j as enumerated in
CPUID.(EAX=0DH,ECX=j):EAX. (This item implies that state component i
immediately follows the preceding state component whose bit is set
in XCOMP_BV.)"
So don't assume that the first zero xfeatures bit means the end of
all xfeatures - iterate through all of them.
I'm not aware of hardware that triggers this currently.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
kernel_fpu_begin() is __kernel_fpu_begin() with a preempt_disable().
Move the kernel_fpu_begin() debugging check into __kernel_fpu_begin(),
so that users of __kernel_fpu_begin() may benefit from it as well.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Improve the memory layout of 'struct fpu':
- change ->fpregs_active from 'int' to 'char' - it's just a single flag
and modern x86 CPUs can do efficient byte accesses.
- pack related fields closer to each other: often 'fpu->state' will not be
touched, while the other fields will - so pack them into a group.
Also add comments to each field, describing their purpose, and add
some background information about lazy restores.
Also fix an obsolete, lazy switching related comment in fpu_copy()'s description.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
So much of fpu/core.c is the regset code, but it just obscures the generic
FPU state machine logic. Factor out the regset code into fpu/regset.c, where
it can be read in isolation.
This affects one API: fpu__activate_stopped() has to be made available
from the core to fpu/regset.c.
No change in functionality.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
fpu/xstate.c has a lot of generic FPU signal frame handling routines,
move them into a separate file: fpu/signal.c.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Move restore_init_xstate() next to its sole caller.
Also rename it to copy_init_fpstate_to_fpregs() and add
some comments about what it does.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
So the handling of init_xstate_ctx has a layering violation: both
'struct xsave_struct' and 'union thread_xstate' have a
'struct i387_fxsave_struct' member:
xsave_struct::i387
thread_xstate::fxsave
The handling of init_xstate_ctx is generic, it is used on all
CPUs, with or without XSAVE instruction. So it's confusing how
the generic code passes around and handles an XSAVE specific
format.
What we really want is for init_xstate_ctx to be a proper
fpstate and we use its ::fxsave and ::xsave members, as
appropriate.
Since the xsave_struct::i387 and thread_xstate::fxsave aliases
each other this is not a functional problem.
So implement this, and move init_xstate_ctx to the generic FPU
code in the process.
Also, since init_xstate_ctx is not XSAVE specific anymore,
rename it to init_fpstate, and mark it __read_mostly,
because it's only modified once during bootup, and used
as a reference fpstate later on.
There's no change in functionality.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
fpstate_init() only uses fpu->state, so pass that in to it.
This enables the cleanup we will do in the next patch.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Factor out the FPU error code handling code from traps.c and fpu/internal.h
and move them close to each other.
Also convert the helper functions to 'struct fpu *', which further simplifies
them.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Remove various boot quirks that came from the old code.
The new code is cleanly split up into per-system and per-cpu
init sequences, and system init functions are only called once.
Remove the run-once quirks.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Only a few places use the regset definitions, so factor them out.
Also fix related header dependency assumptions.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Most of the FPU does not use them, so split it out and include
them in signal.c and ia32_signal.c
Also fix header file dependency assumption in fpu/core.c.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
With recent cleanups and fixes the fpu__reset() and fpu__clear()
functions have become almost identical in functionality: the only
difference is that fpu__reset() assumed that the fpstate
was already active in the eagerfpu case, while fpu__clear()
activated it if it was inactive.
This distinction almost never matters, the only case where such
fpstate activation happens if if the init thread (PID 1) gets exec()-ed
for the first time.
So keep fpu__clear() and change all fpu__reset() uses to
fpu__clear() to simpify the logic.
( In a later patch we'll further simplify fpu__clear() by making
sure that all contexts it is called on are already active. )
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Consolidate more signal frame related functions:
text data bss dec filename
14108070 2575280 1634304 18317654 vmlinux.before
14107944 2575344 1634304 18317592 vmlinux.after
Also, while moving it, rename alloc_mathframe() to fpu__alloc_mathframe().
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
restore_xstate_sig() is a misnomer: it's not limited to 'xstate' at all,
it is the high level 'restore FPU state from a signal frame' function
that works with all legacy FPU formats as well.
Rename it (and its helper) accordingly, and also move it to the
fpu__*() namespace.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Do it like all other high level FPU state handling functions: they
only know about struct fpu, not about the task.
(Also remove a dead prototype while at it.)
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The fpu__*() methods are closely related, but they are defined
in scattered places within the FPU code.
Concentrate them, and also uninline fpu__save(), fpu__drop()
and fpu__reset() to save about 5K of kernel text on 64-bit kernels:
text data bss dec filename
14113063 2575280 1634304 18322647 vmlinux.before
14108070 2575280 1634304 18317654 vmlinux.after
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
fpu_restore_checking() is a helper function of restore_fpu_checking(),
but this is not apparent from the naming.
Both copy fpstate contents to fpregs, while the fuller variant does
a full copy without leaking information.
So rename them to:
copy_fpstate_to_fpregs()
__copy_fpstate_to_fpregs()
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
drop_fpu() and fpu_reset_state() are similar in functionality
and in scope, yet this is not apparent from their names.
drop_fpu() deactivates FPU contents (both the fpregs and the fpstate),
but leaves register contents intact in the eager-FPU case, mostly as an
optimization. It disables fpregs in the lazy FPU case. The drop_fpu()
method can be used to destroy FPU state in an optimized way, when we
know that a new state will be loaded before user-space might see
any remains of the old FPU state:
- such as in sys_exit()'s exit_thread() where we know this task
won't execute any user-space instructions anymore and the
next context switch cleans up the FPU. The old FPU state
might still be around in the eagerfpu case but won't be
saved.
- in __restore_xstate_sig(), where we use drop_fpu() before
copying a new state into the fpstate and activating that one.
No user-pace instructions can execute between those steps.
- in sys_execve()'s fpu__clear(): there we use drop_fpu() in
the !eagerfpu case, where it's equivalent to a full reinit.
fpu_reset_state() is a stronger version of drop_fpu(): both in
the eagerfpu and the lazy-FPU case it guarantees that fpregs
are reinitialized to init state. This method is used in cases
where we need a full reset:
- handle_signal() uses fpu_reset_state() to reset the FPU state
to init before executing a user-space signal handler. While we
have already saved the original FPU state at this point, and
always restore the original state, the signal handling code
still has to do this reinit, because signals may interrupt
any user-space instruction, and the FPU might be in various
intermediate states (such as an unbalanced x87 stack) that is
not immediately usable for general C signal handler code.
- __restore_xstate_sig() uses fpu_reset_state() when the signal
frame has no FP context. Since the signal handler may have
modified the FPU state, it gets reset back to init state.
- in another branch __restore_xstate_sig() uses fpu_reset_state()
to handle a restoration error: when restore_user_xstate() fails
to restore FPU state and we might have inconsistent FPU data,
fpu_reset_state() is used to reset it back to a known good
state.
- __kernel_fpu_end() uses fpu_reset_state() in an error branch.
This is in a 'must not trigger' error branch, so on bug-free
kernels this never triggers.
- fpu__restore() uses fpu_reset_state() in an error path
as well: if the fpstate was set up with invalid FPU state
(via ptrace or via a signal handler), then it's reset back
to init state.
- likewise, the scheduler's switch_fpu_finish() uses it in a
restoration error path too.
Move both drop_fpu() and fpu_reset_state() to the fpu__*() namespace
and harmonize their naming with their function:
fpu__drop()
fpu__reset()
This clearly shows that both methods operate on the full state of the
FPU, just like fpu__restore().
Also add comments to explain what each function does.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>