linux/arch/x86/include/asm/xsave.h

263 lines
6.2 KiB
C
Raw Normal View History

#ifndef __ASM_X86_XSAVE_H
#define __ASM_X86_XSAVE_H
#include <linux/types.h>
#include <asm/processor.h>
#define XSTATE_CPUID 0x0000000d
#define XSTATE_FP 0x1
#define XSTATE_SSE 0x2
#define XSTATE_YMM 0x4
#define XSTATE_BNDREGS 0x8
#define XSTATE_BNDCSR 0x10
#define XSTATE_OPMASK 0x20
#define XSTATE_ZMM_Hi256 0x40
#define XSTATE_Hi16_ZMM 0x80
#define XSTATE_FPSSE (XSTATE_FP | XSTATE_SSE)
/* Bit 63 of XCR0 is reserved for future expansion */
#define XSTATE_EXTEND_MASK (~(XSTATE_FPSSE | (1ULL << 63)))
#define FXSAVE_SIZE 512
#define XSAVE_HDR_SIZE 64
#define XSAVE_HDR_OFFSET FXSAVE_SIZE
#define XSAVE_YMM_SIZE 256
#define XSAVE_YMM_OFFSET (XSAVE_HDR_SIZE + XSAVE_HDR_OFFSET)
/* Supported features which support lazy state saving */
#define XSTATE_LAZY (XSTATE_FP | XSTATE_SSE | XSTATE_YMM \
| XSTATE_OPMASK | XSTATE_ZMM_Hi256 | XSTATE_Hi16_ZMM)
/* Supported features which require eager state saving */
#define XSTATE_EAGER (XSTATE_BNDREGS | XSTATE_BNDCSR)
/* All currently supported features */
#define XCNTXT_MASK (XSTATE_LAZY | XSTATE_EAGER)
#ifdef CONFIG_X86_64
#define REX_PREFIX "0x48, "
#else
#define REX_PREFIX
#endif
extern unsigned int xstate_size;
extern u64 pcntxt_mask;
extern u64 xstate_fx_sw_bytes[USER_XSTATE_FX_SW_WORDS];
x86, fpu: use non-lazy fpu restore for processors supporting xsave Fundamental model of the current Linux kernel is to lazily init and restore FPU instead of restoring the task state during context switch. This changes that fundamental lazy model to the non-lazy model for the processors supporting xsave feature. Reasons driving this model change are: i. Newer processors support optimized state save/restore using xsaveopt and xrstor by tracking the INIT state and MODIFIED state during context-switch. This is faster than modifying the cr0.TS bit which has serializing semantics. ii. Newer glibc versions use SSE for some of the optimized copy/clear routines. With certain workloads (like boot, kernel-compilation etc), application completes its work with in the first 5 task switches, thus taking upto 5 #DNA traps with the kernel not getting a chance to apply the above mentioned pre-load heuristic. iii. Some xstate features (like AMD's LWP feature) don't honor the cr0.TS bit and thus will not work correctly in the presence of lazy restore. Non-lazy state restore is needed for enabling such features. Some data on a two socket SNB system: * Saved 20K DNA exceptions during boot on a two socket SNB system. * Saved 50K DNA exceptions during kernel-compilation workload. * Improved throughput of the AVX based checksumming function inside the kernel by ~15% as xsave/xrstor is faster than the serializing clts/stts pair. Also now kernel_fpu_begin/end() relies on the patched alternative instructions. So move check_fpu() which uses the kernel_fpu_begin/end() after alternative_instructions(). Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1345842782-24175-7-git-send-email-suresh.b.siddha@intel.com Merge 32-bit boot fix from, Link: http://lkml.kernel.org/r/1347300665-6209-4-git-send-email-suresh.b.siddha@intel.com Cc: Jim Kukunas <james.t.kukunas@linux.intel.com> Cc: NeilBrown <neilb@suse.de> Cc: Avi Kivity <avi@redhat.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-08-24 21:13:02 +00:00
extern struct xsave_struct *init_xstate_buf;
extern void xsave_init(void);
extern void update_regset_xstate_info(unsigned int size, u64 xstate_mask);
extern int init_fpu(struct task_struct *child);
/* These macros all use (%edi)/(%rdi) as the single memory argument. */
#define XSAVE ".byte " REX_PREFIX "0x0f,0xae,0x27"
#define XSAVEOPT ".byte " REX_PREFIX "0x0f,0xae,0x37"
#define XSAVES ".byte " REX_PREFIX "0x0f,0xc7,0x2f"
#define XRSTOR ".byte " REX_PREFIX "0x0f,0xae,0x2f"
#define XRSTORS ".byte " REX_PREFIX "0x0f,0xc7,0x1f"
#define xstate_fault ".section .fixup,\"ax\"\n" \
"3: movl $-1,%[err]\n" \
" jmp 2b\n" \
".previous\n" \
_ASM_EXTABLE(1b, 3b) \
: [err] "=r" (err)
x86/xsaves: Add xsaves and xrstors support for booting time Since boot_cpu_data and cpu capabilities are not enabled yet during early booting time, alternative can not be used in some functions to access xsave area. Therefore, we define two new functions xrstor_state_booting() and xsave_state_booting() to access xsave area just during early booting time. xrstor_state_booting restores xstate from xsave area during early booting time. xsave_state_booting saves xstate to xsave area during early booting time. The two functions are similar to xrstor_state and xsave_state respectively. But the two functions don't use alternatives because alternatives are not enabled when they are called in such early booting time. xrstor_state_booting is called only by functions defined as __init. So it's defined as __init and will be removed from memory after booting time. There is no extra memory cost caused by this function during running time. But because xsave_state_booting can be called by run-time function __save_fpu(), it's not defined as __init and will stay in memory during running time although it will not be called anymore during running time. It is not ideal to have this function stay in memory during running time. But it's a pretty small function and the memory cost will be small. By doing in this way, we can avoid to change a lot of code to just remove this small function and save a bit memory for running time. Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: http://lkml.kernel.org/r/1401387164-43416-13-git-send-email-fenghua.yu@intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-05-29 18:12:40 +00:00
/*
* This function is called only during boot time when x86 caps are not set
* up and alternative can not be used yet.
*/
static inline int xsave_state_booting(struct xsave_struct *fx, u64 mask)
x86/xsaves: Add xsaves and xrstors support for booting time Since boot_cpu_data and cpu capabilities are not enabled yet during early booting time, alternative can not be used in some functions to access xsave area. Therefore, we define two new functions xrstor_state_booting() and xsave_state_booting() to access xsave area just during early booting time. xrstor_state_booting restores xstate from xsave area during early booting time. xsave_state_booting saves xstate to xsave area during early booting time. The two functions are similar to xrstor_state and xsave_state respectively. But the two functions don't use alternatives because alternatives are not enabled when they are called in such early booting time. xrstor_state_booting is called only by functions defined as __init. So it's defined as __init and will be removed from memory after booting time. There is no extra memory cost caused by this function during running time. But because xsave_state_booting can be called by run-time function __save_fpu(), it's not defined as __init and will stay in memory during running time although it will not be called anymore during running time. It is not ideal to have this function stay in memory during running time. But it's a pretty small function and the memory cost will be small. By doing in this way, we can avoid to change a lot of code to just remove this small function and save a bit memory for running time. Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: http://lkml.kernel.org/r/1401387164-43416-13-git-send-email-fenghua.yu@intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2014-05-29 18:12:40 +00:00
{
u32 lmask = mask;
u32 hmask = mask >> 32;
int err = 0;
WARN_ON(system_state != SYSTEM_BOOTING);
if (boot_cpu_has(X86_FEATURE_XSAVES))
asm volatile("1:"XSAVES"\n\t"
"2:\n\t"
: : "D" (fx), "m" (*fx), "a" (lmask), "d" (hmask)
: "memory");
else
asm volatile("1:"XSAVE"\n\t"
"2:\n\t"
: : "D" (fx), "m" (*fx), "a" (lmask), "d" (hmask)
: "memory");
asm volatile(xstate_fault
: "0" (0)
: "memory");
return err;
}
/*
* This function is called only during boot time when x86 caps are not set
* up and alternative can not be used yet.
*/
static inline int xrstor_state_booting(struct xsave_struct *fx, u64 mask)
{
u32 lmask = mask;
u32 hmask = mask >> 32;
int err = 0;
WARN_ON(system_state != SYSTEM_BOOTING);
if (boot_cpu_has(X86_FEATURE_XSAVES))
asm volatile("1:"XRSTORS"\n\t"
"2:\n\t"
: : "D" (fx), "m" (*fx), "a" (lmask), "d" (hmask)
: "memory");
else
asm volatile("1:"XRSTOR"\n\t"
"2:\n\t"
: : "D" (fx), "m" (*fx), "a" (lmask), "d" (hmask)
: "memory");
asm volatile(xstate_fault
: "0" (0)
: "memory");
return err;
}
/*
* Save processor xstate to xsave area.
*/
static inline int xsave_state(struct xsave_struct *fx, u64 mask)
{
u32 lmask = mask;
u32 hmask = mask >> 32;
int err = 0;
/*
* If xsaves is enabled, xsaves replaces xsaveopt because
* it supports compact format and supervisor states in addition to
* modified optimization in xsaveopt.
*
* Otherwise, if xsaveopt is enabled, xsaveopt replaces xsave
* because xsaveopt supports modified optimization which is not
* supported by xsave.
*
* If none of xsaves and xsaveopt is enabled, use xsave.
*/
alternative_input_2(
"1:"XSAVE,
"1:"XSAVEOPT,
X86_FEATURE_XSAVEOPT,
"1:"XSAVES,
X86_FEATURE_XSAVES,
[fx] "D" (fx), "a" (lmask), "d" (hmask) :
"memory");
asm volatile("2:\n\t"
xstate_fault
: "0" (0)
: "memory");
return err;
}
/*
* Restore processor xstate from xsave area.
*/
static inline int xrstor_state(struct xsave_struct *fx, u64 mask)
{
int err = 0;
u32 lmask = mask;
u32 hmask = mask >> 32;
/*
* Use xrstors to restore context if it is enabled. xrstors supports
* compacted format of xsave area which is not supported by xrstor.
*/
alternative_input(
"1: " XRSTOR,
"1: " XRSTORS,
X86_FEATURE_XSAVES,
"D" (fx), "m" (*fx), "a" (lmask), "d" (hmask)
: "memory");
asm volatile("2:\n"
xstate_fault
: "0" (0)
: "memory");
return err;
}
/*
* Save xstate context for old process during context switch.
*/
static inline void fpu_xsave(struct fpu *fpu)
{
xsave_state(&fpu->state->xsave, -1);
}
/*
* Restore xstate context for new process during context switch.
*/
static inline int fpu_xrstor_checking(struct xsave_struct *fx)
{
return xrstor_state(fx, -1);
}
/*
* Save xstate to user space xsave area.
*
* We don't use modified optimization because xrstor/xrstors might track
* a different application.
*
* We don't use compacted format xsave area for
* backward compatibility for old applications which don't understand
* compacted format of xsave area.
*/
static inline int xsave_user(struct xsave_struct __user *buf)
{
int err;
/*
* Clear the xsave header first, so that reserved fields are
* initialized to zero.
*/
x86, fpu: Unify signal handling code paths for x86 and x86_64 kernels Currently for x86 and x86_32 binaries, fpstate in the user sigframe is copied to/from the fpstate in the task struct. And in the case of signal delivery for x86_64 binaries, if the fpstate is live in the CPU registers, then the live state is copied directly to the user sigframe. Otherwise fpstate in the task struct is copied to the user sigframe. During restore, fpstate in the user sigframe is restored directly to the live CPU registers. Historically, different code paths led to different bugs. For example, x86_64 code path was not preemption safe till recently. Also there is lot of code duplication for support of new features like xsave etc. Unify signal handling code paths for x86 and x86_64 kernels. New strategy is as follows: Signal delivery: Both for 32/64-bit frames, align the core math frame area to 64bytes as needed by xsave (this where the main fpu/extended state gets copied to and excludes the legacy compatibility fsave header for the 32-bit [f]xsave frames). If the state is live, copy the register state directly to the user frame. If not live, copy the state in the thread struct to the user frame. And for 32-bit [f]xsave frames, construct the fsave header separately before the actual [f]xsave area. Signal return: As the 32-bit frames with [f]xstate has an additional 'fsave' header, copy everything back from the user sigframe to the fpstate in the task structure and reconstruct the fxstate from the 'fsave' header (Also user passed pointers may not be correctly aligned for any attempt to directly restore any partial state). At the next fpstate usage, everything will be restored to the live CPU registers. For all the 64-bit frames and the 32-bit fsave frame, restore the state from the user sigframe directly to the live CPU registers. 64-bit signals always restored the math frame directly, so we can expect the math frame pointer to be correctly aligned. For 32-bit fsave frames, there are no alignment requirements, so we can restore the state directly. "lat_sig catch" microbenchmark numbers (for x86, x86_64, x86_32 binaries) are with in the noise range with this change. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1343171129-2747-4-git-send-email-suresh.b.siddha@intel.com [ Merged in compilation fix ] Link: http://lkml.kernel.org/r/1344544736.8326.17.camel@sbsiddha-desk.sc.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-07-24 23:05:29 +00:00
err = __clear_user(&buf->xsave_hdr, sizeof(buf->xsave_hdr));
if (unlikely(err))
return -EFAULT;
__asm__ __volatile__(ASM_STAC "\n"
"1:"XSAVE"\n"
"2: " ASM_CLAC "\n"
xstate_fault
: "D" (buf), "a" (-1), "d" (-1), "0" (0)
: "memory");
return err;
}
/*
* Restore xstate from user space xsave area.
*/
static inline int xrestore_user(struct xsave_struct __user *buf, u64 mask)
{
int err = 0;
struct xsave_struct *xstate = ((__force struct xsave_struct *)buf);
u32 lmask = mask;
u32 hmask = mask >> 32;
__asm__ __volatile__(ASM_STAC "\n"
"1:"XRSTOR"\n"
"2: " ASM_CLAC "\n"
xstate_fault
: "D" (xstate), "a" (lmask), "d" (hmask), "0" (0)
: "memory"); /* memory required? */
return err;
}
void *get_xsave_addr(struct xsave_struct *xsave, int xstate);
void setup_xstate_comp(void);
#endif