linux/arch/x86/lib
Mikulas Patocka 02101c45ec x86/asm: Optimize memcpy_flushcache()
I use memcpy_flushcache() in my persistent memory driver for metadata
updates, there are many 8-byte and 16-byte updates and it turns out that
the overhead of memcpy_flushcache causes 2% performance degradation
compared to "movnti" instruction explicitly coded using inline assembler.

The tests were done on a Skylake processor with persistent memory emulated
using the "memmap" kernel parameter. dd was used to copy data to the
dm-writecache target.

This patch recognizes memcpy_flushcache calls with constant short length
and turns them into inline assembler - so that I don't have to use inline
assembler in the driver.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: device-mapper development <dm-devel@redhat.com>
Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1808081720460.24747@file01.intranet.prod.int.rdu2.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-09-10 15:17:12 +02:00
..
.gitignore
atomic64_32.c x86: Adjust asm constraints in atomic64 wrappers 2012-01-20 17:29:31 -08:00
atomic64_386_32.S x86/debug: Remove perpetually broken, unmaintainable dwarf annotations 2015-06-02 07:57:48 +02:00
atomic64_cx8_32.S x86/debug: Remove perpetually broken, unmaintainable dwarf annotations 2015-06-02 07:57:48 +02:00
cache-smp.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
checksum_32.S x86/retpoline/checksum32: Convert assembler indirect jumps 2018-01-12 00:14:31 +01:00
clear_page_64.S x86/asm: Trim clear_page.S includes 2018-02-13 17:37:07 +01:00
cmdline.c x86/boot: Add early cmdline parsing for options with arguments 2017-07-18 11:38:06 +02:00
cmpxchg8b_emu.S x86: move exports to actual definitions 2016-08-07 23:47:15 -04:00
cmpxchg16b_emu.S x86/debug: Remove perpetually broken, unmaintainable dwarf annotations 2015-06-02 07:57:48 +02:00
copy_page_64.S License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
copy_user_64.S x86/uaccess: Optimize copy_user_enhanced_fast_string() for short strings 2017-06-30 09:52:51 +02:00
cpu.c x86/cpu: Rename cpu_data.x86_mask to cpu_data.x86_stepping 2018-02-15 01:15:52 +01:00
csum-copy_64.S x86/asm: Don't use RBP as a temporary register in csum_partial_copy_generic() 2017-05-05 07:59:24 +02:00
csum-partial_64.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
csum-wrappers_64.c Merge branch 'x86-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-08-01 14:23:42 -04:00
delay.c Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2018-01-30 13:01:09 -08:00
error-inject.c x86/error_inject: Make just_return_func() globally visible 2018-02-13 14:33:35 +01:00
getuser.S x86/get_user: Use pointer masking to limit speculation 2018-01-30 21:54:31 +01:00
hweight.S License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
inat.c x86: Fix to decode grouped AVX with VEX pp bits 2012-02-11 15:11:35 +01:00
insn-eval.c x86/umip: Fix insn_get_code_seg_params()'s return value 2017-11-23 20:17:59 +01:00
insn.c x86/insn: Add AVX-512 support to the instruction decoder 2016-07-21 09:37:11 -03:00
iomap_copy_64.S x86/debug: Remove perpetually broken, unmaintainable dwarf annotations 2015-06-02 07:57:48 +02:00
kaslr.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
Makefile Revert "x86/retpoline: Simplify vmexit_fill_RSB()" 2018-02-20 09:38:26 +01:00
memcpy_32.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
memcpy_64.S x86/asm/64: Use 32-bit XOR to zero registers 2018-07-03 09:59:29 +02:00
memmove_64.S License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
memset_64.S License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
misc.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
mmx_32.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
msr-reg-export.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
msr-reg.S License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
msr-smp.c x86/msr: Make rdmsrl_safe_on_cpu() scheduling safe as well 2018-03-28 10:34:13 +02:00
msr.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
putuser.S License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
retpoline.S Revert "x86/retpoline: Simplify vmexit_fill_RSB()" 2018-02-20 09:38:26 +01:00
rwsem.S locking/arch, x86: Add __down_read_killable() 2017-10-10 11:50:15 +02:00
string_32.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
strstr_32.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
usercopy_32.c x86/uaccess: Use __uaccess_begin_nospec() and uaccess_try_nospec 2018-01-30 21:54:31 +01:00
usercopy_64.c x86/asm: Optimize memcpy_flushcache() 2018-09-10 15:17:12 +02:00
usercopy.c x86/nmi: Fix NMI uaccess race against CR3 switching 2018-08-31 17:08:22 +02:00
x86-opcode-map.txt x86/decoder: Fix and update the opcodes map 2017-12-15 13:45:20 +01:00