linux/arch/x86/include/asm
Suresh Siddha 841e3604d3 x86, fpu: always use kernel_fpu_begin/end() for in-kernel FPU usage
use kernel_fpu_begin/end() instead of unconditionally accessing cr0 and
saving/restoring just the few used xmm/ymm registers.

This has some advantages like:
* If the task's FPU state is already active, then kernel_fpu_begin()
  will just save the user-state and avoiding the read/write of cr0.
  In general, cr0 accesses are much slower.

* Manual save/restore of xmm/ymm registers will affect the 'modified' and
  the 'init' optimizations brought in the by xsaveopt/xrstor
  infrastructure.

* Foward compatibility with future vector register extensions will be a
  problem if the xmm/ymm registers are manually saved and restored
  (corrupting the extended state of those vector registers).

With this patch, there was no significant difference in the xor throughput
using AVX, measured during boot.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/1345842782-24175-5-git-send-email-suresh.b.siddha@intel.com
Cc: Jim Kukunas <james.t.kukunas@linux.intel.com>
Cc: NeilBrown <neilb@suse.de>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-09-18 15:52:08 -07:00
..
crypto crypto: move arch/x86/include/asm/aes.h to arch/x86/include/asm/crypto/ 2012-06-27 14:42:03 +08:00
numachip
uv Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-07-26 13:17:17 -07:00
visws
xen xen/mce: Add mcelog support for Xen platform 2012-07-19 15:51:36 -04:00
a.out-core.h
a.out.h
acpi.h x86, realmode: Unbreak the ia64 build of drivers/acpi/sleep.c 2012-05-30 10:12:48 -07:00
agp.h
alternative-asm.h
alternative.h x86/copy_user_generic: Optimize copy_user_generic with CPU erms feature 2012-06-29 15:33:34 -07:00
amd_nb.h x86, MCE, AMD: Move shared bank to node descriptor 2012-06-07 12:43:44 +02:00
apb_timer.h
apic_flat_64.h
apic.h Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-07-26 13:17:17 -07:00
apicdef.h x86/apic: Fix typo EIO_ACK -> EOI_ACK and document it 2012-05-18 09:46:07 +02:00
apm.h
arch_hweight.h
archrandom.h
asm-offsets.h
asm.h x86, extable: Switch to relative exception table entries 2012-04-20 17:22:34 -07:00
atomic64_32.h atomic64_32.h: fix parameter naming mismatch 2012-05-09 11:38:20 +02:00
atomic64_64.h
atomic.h
auxvec.h Disintegrate asm/system.h for X86 2012-03-28 18:11:12 +01:00
barrier.h Disintegrate asm/system.h for X86 2012-03-28 18:11:12 +01:00
bios_ebda.h
bitops.h x86, bitops: note on __test_and_clear_bit atomicity 2012-06-25 12:38:35 +03:00
bitsperlong.h
boot.h x86: Use common threadinfo allocator 2012-05-08 14:08:44 +02:00
bootparam.h x86, efi: Handover Protocol 2012-07-20 16:18:58 -07:00
bug.h Disintegrate asm/system.h for X86 2012-03-28 18:11:12 +01:00
bugs.h
byteorder.h
cache.h
cacheflush.h Disintegrate asm/system.h for X86 2012-03-28 18:11:12 +01:00
calgary.h
calling.h
ce4100.h
checksum_32.h
checksum_64.h
checksum.h
clocksource.h
cmpxchg_32.h
cmpxchg_64.h
cmpxchg.h x86: Use correct byte-sized register constraint in __add() 2012-04-06 09:40:07 -07:00
compat.h x86: replace percpu_xxx funcs with this_cpu_xxx 2012-05-14 14:15:31 -07:00
cpu_device_id.h Add driver auto probing for x86 features v4 2012-01-26 16:44:41 -08:00
cpu.h
cpufeature.h x86, cpufeature: Add the RDSEED and ADX features 2012-07-20 13:36:41 -07:00
cpumask.h
cputime.h
current.h x86: replace percpu_xxx funcs with this_cpu_xxx 2012-05-14 14:15:31 -07:00
debugreg.h x86: relocate get/set debugreg fcns to include/asm/debugreg. 2012-02-28 17:48:04 -05:00
delay.h
desc_defs.h
desc.h x86: replace percpu_xxx funcs with this_cpu_xxx 2012-05-14 14:15:31 -07:00
device.h x86-32: Introduce CONFIG_X86_DEV_DMA_OPS 2012-04-12 11:09:56 -07:00
div64.h
dma-contiguous.h X86: integrate CMA with DMA-mapping subsystem 2012-05-21 15:09:38 +02:00
dma-mapping.h Merge branch 'for-linus' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping 2012-05-25 09:18:59 -07:00
dma.h
dmi.h
dwarf2.h
e820.h
edac.h
efi.h x86, efi: Allow basic init with mixed 32/64-bit efi/kernel 2012-02-23 18:54:51 -08:00
elf.h Merge branch 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-03-29 18:12:23 -07:00
emergency-restart.h x86-64, reboot: Allow reboot=bios and reboot-cpu override on x86-64 2012-06-17 10:51:01 -07:00
entry_arch.h x86/tlb: replace INVALIDATE_TLB_VECTOR by CALL_FUNCTION_VECTOR 2012-06-27 19:29:13 -07:00
errno.h
exec.h Disintegrate asm/system.h for X86 2012-03-28 18:11:12 +01:00
fb.h
fcntl.h
fixmap.h
floppy.h x86/debug: Add KERN_<LEVEL> to bare printks, convert printks to pr_<level> 2012-06-06 09:17:22 +02:00
fpu-internal.h x86, fpu: remove unnecessary user_fpu_end() in save_xstate_sig() 2012-09-18 15:52:06 -07:00
frame.h
ftrace.h ftrace: Synchronize variable setting with breakpoints 2012-05-31 23:12:17 -04:00
futex.h Disintegrate asm/system.h for X86 2012-03-28 18:11:12 +01:00
gart.h
genapic.h
geode.h
gpio.h gpiolib/arches: Centralise bolierplate asm/gpio.h 2012-05-11 18:00:14 -06:00
hardirq.h x86: replace percpu_xxx funcs with this_cpu_xxx 2012-05-14 14:15:31 -07:00
highmem.h highmem: kill all __kmap_atomic() 2012-03-20 21:48:30 +08:00
hpet.h
hugetlb.h
hw_breakpoint.h
hw_irq.h
hypertransport.h
hyperv.h
hypervisor.h KVM: Add x86_hyper_kvm to complete detect_hypervisor_platform check 2012-07-11 19:33:32 +03:00
i387.h Disintegrate asm/system.h for X86 2012-03-28 18:11:12 +01:00
i8259.h
ia32_unistd.h
ia32.h signal, x86: add SIGSYS info and make it synchronous. 2012-04-14 11:13:21 +10:00
idle.h x86: Merge the x86_32 and x86_64 cpu_idle() functions 2012-03-26 03:16:07 +02:00
inat_types.h
inat.h x86: Fix to decode grouped AVX with VEX pp bits 2012-02-11 15:11:35 +01:00
init.h
insn.h x86: Fix to decode grouped AVX with VEX pp bits 2012-02-11 15:11:35 +01:00
inst.h
intel_scu_ipc.h
io_apic.h x86/apic: Replace io_apic_ops with x86_io_apic_ops. 2012-05-01 14:50:09 -04:00
io.h
ioctl.h
ioctls.h
iomap.h
iommu_table.h x86/iommu: Use NULL instead of plain 0 for __IOMMU_INIT 2012-09-05 10:52:26 +02:00
iommu.h iommu: Remove group_mf 2012-06-25 13:48:30 +02:00
ipcbuf.h
ipi.h
irq_regs.h x86: replace percpu_xxx funcs with this_cpu_xxx 2012-05-14 14:15:31 -07:00
irq_remapping.h irq_remap: Fix compiler warning with CONFIG_IRQ_REMAP=y 2012-05-08 11:17:29 +02:00
irq_vectors.h x86/tlb: replace INVALIDATE_TLB_VECTOR by CALL_FUNCTION_VECTOR 2012-06-27 19:29:13 -07:00
irq.h
irqflags.h
ist.h
jump_label.h static keys: Introduce 'struct static_key', static_key_true()/false() and static_key_slow_[inc|dec]() 2012-02-24 10:05:59 +01:00
kbdleds.h keyboard: Use BIOS Keyboard variable to set Numlock 2012-05-08 14:19:41 -07:00
Kbuild x32: Generate <asm/unistd_x32.h> 2012-02-20 12:51:00 -08:00
kdebug.h x86: Avoid double stack traces with show_regs() 2012-05-09 11:44:42 +02:00
kexec.h
kgdb.h kgdb: x86: Return all segment registers also in 64-bit mode 2012-03-22 15:07:15 -05:00
kmap_types.h
kmemcheck.h
kprobes.h
kvm_emulate.h KVM: x86 emulator: initialize memop 2012-07-09 14:19:02 +03:00
kvm_host.h KVM updates for the 3.6 merge window 2012-07-24 12:01:20 -07:00
kvm_para.h KVM guest: guest side for eoi avoidance 2012-06-25 12:38:06 +03:00
kvm.h KVM: Introduce __KVM_HAVE_IRQ_LINE 2012-06-18 16:06:35 +03:00
ldt.h
lguest_hcall.h
lguest.h
linkage.h
local64.h
local.h Disintegrate asm/system.h for X86 2012-03-28 18:11:12 +01:00
mach_timer.h
mach_traps.h
math_emu.h
mc146818rtc.h Disintegrate asm/system.h for X86 2012-03-28 18:11:12 +01:00
mce.h x86/mce: Move MCACOD defines from mce-severity.c to <asm/mce.h> 2012-07-26 15:05:47 +02:00
microcode.h
mman.h
mmconfig.h
mmu_context.h x86: replace percpu_xxx funcs with this_cpu_xxx 2012-05-14 14:15:31 -07:00
mmu.h
mmx.h
mmzone_32.h x86: Drop obsolete ARCH_BOOTMEM support 2012-04-14 14:28:58 +02:00
mmzone_64.h
mmzone.h
module.h
mpspec_def.h MCA: delete all remaining traces of microchannel bus support. 2012-05-17 19:06:13 -04:00
mpspec.h MCA: delete all remaining traces of microchannel bus support. 2012-05-17 19:06:13 -04:00
mrst-vrtc.h
mrst.h x86/mid: Remove Intel Moorestown 2012-01-26 21:23:53 +01:00
msgbuf.h
mshyperv.h
msidef.h
msr-index.h Merge branch 'perf/x86-ibs' into perf/core 2012-05-09 15:22:23 +02:00
msr.h Merge branch 'x86/cpu' into perf/core 2012-07-05 21:12:11 +02:00
mtrr.h x86, mtrr: Use explicit sizing and padding for the 64-bit ioctls 2012-03-01 12:48:52 -08:00
mutex_32.h
mutex_64.h
mutex.h
mwait.h
nmi.h x86/nmi: Clean up register_nmi_handler() usage 2012-06-20 14:23:17 +02:00
nops.h x86, nop: Make the ASM_NOP* macros work from assembly 2012-04-19 15:07:42 -07:00
numa_32.h
numa_64.h
numa.h
numaq.h
olpc_ofw.h
olpc.h x86: OLPC: switch over to using new EC driver on x86 2012-07-31 23:27:30 -04:00
page_32_types.h x86: Use common threadinfo allocator 2012-05-08 14:08:44 +02:00
page_32.h
page_64_types.h x86: Use common threadinfo allocator 2012-05-08 14:08:44 +02:00
page_64.h
page_types.h Move all declarations of free_initmem() to linux/mm.h 2012-03-28 18:30:03 +01:00
page.h
param.h
paravirt_types.h Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-07-26 13:17:17 -07:00
paravirt.h Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-07-26 13:17:17 -07:00
parport.h
pat.h
pci_64.h
pci_x86.h PCI changes for the 3.6 merge window: 2012-07-24 16:17:07 -07:00
pci-direct.h
pci-functions.h
pci.h
percpu.h x86: Define early read-mostly per-cpu macros 2012-06-14 12:42:10 +02:00
perf_event_p4.h
perf_event.h perf/x86: Fix USER/KERNEL tagging of samples properly 2012-07-31 17:02:04 +02:00
pgalloc.h
pgtable_32_types.h
pgtable_32.h
pgtable_64_types.h
pgtable_64.h x86/debug: Add KERN_<LEVEL> to bare printks, convert printks to pr_<level> 2012-06-06 09:17:22 +02:00
pgtable_types.h
pgtable-2level_types.h
pgtable-2level.h x86/debug: Add KERN_<LEVEL> to bare printks, convert printks to pr_<level> 2012-06-06 09:17:22 +02:00
pgtable-3level_types.h
pgtable-3level.h Merge branch 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-07-22 12:04:44 -07:00
pgtable.h
poll.h
posix_types_32.h bury __kernel_nlink_t, make internal nlink_t consistent 2012-05-30 21:04:50 -04:00
posix_types_64.h x86: Use generic posix_types.h 2012-02-14 12:01:30 -08:00
posix_types_x32.h x32: Create posix_types_x32.h 2012-02-20 12:48:47 -08:00
posix_types.h x32: Check __ILP32__ instead of __LP64__ for x32 2012-04-23 14:51:14 -07:00
prctl.h
probe_roms.h
processor-cyrix.h
processor-flags.h KVM: VMX: Implement PCID/INVPCID for guests with EPT 2012-07-12 13:07:34 +03:00
processor.h x86/tlb: add tlb_flushall_shift for specific CPU 2012-06-27 19:29:10 -07:00
prom.h irq_domain/x86: Convert x86 (embedded) to use common irq_domain 2012-02-23 14:37:47 -07:00
proto.h
ptrace-abi.h
ptrace.h x86: Move some signal-handling definitions to a common header 2012-02-20 12:52:04 -08:00
pvclock-abi.h x86: pvclock: Add flag to indicate that a vm was stopped by the host 2012-04-08 12:48:57 +03:00
pvclock.h
realmode.h x86-64, reboot: Allow reboot=bios and reboot-cpu override on x86-64 2012-06-17 10:51:01 -07:00
reboot_fixups.h
reboot.h x86-64, reboot: Allow reboot=bios and reboot-cpu override on x86-64 2012-06-17 10:51:01 -07:00
required-features.h
resource.h
resume-trace.h
rio.h
rtc.h
rwlock.h
rwsem.h
scatterlist.h
seccomp_32.h
seccomp_64.h
seccomp.h
sections.h
segment.h x86-64: Handle exception table entries during early boot 2012-04-19 15:42:45 -07:00
sembuf.h
serial.h
setup_arch.h
setup.h
shmbuf.h
shmparam.h
sigcontext32.h
sigcontext.h x32: Check __ILP32__ instead of __LP64__ for x32 2012-04-23 14:51:14 -07:00
sigframe.h x32: Add rt_sigframe_x32 2012-02-20 12:52:05 -08:00
sighandling.h most of set_current_blocked() callers want SIGKILL/SIGSTOP removed from set 2012-06-01 12:58:51 -04:00
siginfo.h x32, siginfo: Provide proper overrides for x32 siginfo_t 2012-04-23 18:11:40 -07:00
signal.h x86, signal: Cleanup ifdefs and is_ia32, is_x32 2012-09-18 15:51:26 -07:00
smp.h Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-07-26 13:17:17 -07:00
smpboot_hooks.h
socket.h
sockios.h
sparsemem.h
special_insns.h Disintegrate asm/system.h for X86 2012-03-28 18:11:12 +01:00
spinlock_types.h x86/spinlocks: Eliminate TICKET_MASK 2012-02-07 10:09:54 +01:00
spinlock.h x86/spinlocks: Fix comment in spinlock.h 2012-08-22 09:52:47 +02:00
sta2x11.h mfd: Add driver for STA2X11 MFD block 2012-05-09 15:34:28 +02:00
stackprotector.h x86: replace percpu_xxx funcs with this_cpu_xxx 2012-05-14 14:15:31 -07:00
stacktrace.h
stat.h vfs: don't force a big memset of stat data just to clear padding fields 2012-05-06 18:02:40 -07:00
statfs.h
string_32.h
string_64.h
string.h
suspend_32.h
suspend_64.h
suspend.h
svm.h
swab.h
swiotlb.h
switch_to.h Disintegrate asm/system.h for X86 2012-03-28 18:11:12 +01:00
sync_bitops.h
sys_ia32.h x86: Fix __user annotations in asm/sys_ia32.h 2012-09-05 10:52:23 +02:00
syscall.h arch/x86: add syscall_get_arch to syscall.h 2012-04-14 11:13:20 +10:00
syscalls.h
tce.h
termbits.h
termios.h
thread_info.h set_restore_sigmask() is never called without SIGPENDING (and never should be) 2012-06-01 12:58:50 -04:00
time.h
timer.h sched/x86: Fix overflow in cyc2ns_offset 2012-03-13 16:27:51 +01:00
timex.h
tlb.h x86/tlb: enable tlb flush range support for x86 2012-06-27 19:29:11 -07:00
tlbflush.h x86/tlb: Fix build warning and crash when building for !SMP 2012-07-20 15:01:48 -07:00
topology.h sched/numa: Rewrite the CONFIG_NUMA sched domain support 2012-05-09 15:00:55 +02:00
traps.h x86: Use enum instead of literals for trap values 2012-03-09 16:47:54 -08:00
tsc.h x86: kvmclock: abstract save/restore sched_clock_state 2012-03-20 12:37:45 +02:00
types.h
uaccess_32.h x86: use the new generic strnlen_user() function 2012-05-26 11:33:54 -07:00
uaccess_64.h x86/copy_user_generic: Optimize copy_user_generic with CPU erms feature 2012-06-29 15:33:34 -07:00
uaccess.h perf/x86: Check if user fp is valid 2012-06-06 17:08:01 +02:00
ucontext.h
unaligned.h
unistd.h ipc: use Kconfig options for __ARCH_WANT_[COMPAT_]IPC_PARSE_VERSION 2012-07-30 17:25:21 -07:00
uprobes.h uprobes: Pass probed vaddr to arch_uprobe_analyze_insn() 2012-06-08 12:22:27 +02:00
user32.h
user_32.h
user_64.h
user.h
vdso.h x86/vdso: Add __user annotation to VDSO32_SYMBOL 2012-09-05 10:52:23 +02:00
vga.h efifb: Implement vga_default_device() (v2) 2012-04-24 09:50:18 +01:00
vgtod.h x86-64: Simplify and optimize vdso clock_gettime monotonic variants 2012-03-23 16:49:33 -07:00
virtext.h Disintegrate asm/system.h for X86 2012-03-28 18:11:12 +01:00
vm86.h
vmx.h KVM: VMX: Implement PCID/INVPCID for guests with EPT 2012-07-12 13:07:34 +03:00
vsyscall.h
vvar.h
word-at-a-time.h word-at-a-time: make the interfaces truly generic 2012-05-26 11:33:40 -07:00
x2apic.h x86/apic: Factor out default target_cpus() operation 2012-06-06 10:22:17 +02:00
x86_init.h Merge branch 'x86/apic' into x86/platform 2012-06-18 11:09:49 +02:00
xcr.h
xor_32.h x86, fpu: always use kernel_fpu_begin/end() for in-kernel FPU usage 2012-09-18 15:52:08 -07:00
xor_64.h x86, fpu: always use kernel_fpu_begin/end() for in-kernel FPU usage 2012-09-18 15:52:08 -07:00
xor_avx.h x86, fpu: always use kernel_fpu_begin/end() for in-kernel FPU usage 2012-09-18 15:52:08 -07:00
xor.h
xsave.h x86, fpu: Unify signal handling code paths for x86 and x86_64 kernels 2012-09-18 15:51:48 -07:00