linux/arch/x86/lib
Paolo Abeni 236222d393 x86/uaccess: Optimize copy_user_enhanced_fast_string() for short strings
According to the Intel datasheet, the REP MOVSB instruction
exposes a pretty heavy setup cost (50 ticks), which hurts
short string copy operations.

This change tries to avoid this cost by calling the explicit
loop available in the unrolled code for strings shorter
than 64 bytes.

The 64 bytes cutoff value is arbitrary from the code logic
point of view - it has been selected based on measurements,
as the largest value that still ensures a measurable gain.

Micro benchmarks of the __copy_from_user() function with
lengths in the [0-63] range show this performance gain
(shorter the string, larger the gain):

 - in the [55%-4%] range on Intel Xeon(R) CPU E5-2690 v4
 - in the [72%-9%] range on Intel Core i7-4810MQ

Other tested CPUs - namely Intel Atom S1260 and AMD Opteron
8216 - show no difference, because they do not expose the
ERMS feature bit.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/4533a1d101fd460f80e21329a34928fad521c1d4.1498744345.git.pabeni@redhat.com
[ Clarified the changelog. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-30 09:52:51 +02:00
..
.gitignore
atomic64_32.c
atomic64_386_32.S x86/debug: Remove perpetually broken, unmaintainable dwarf annotations 2015-06-02 07:57:48 +02:00
atomic64_cx8_32.S x86/debug: Remove perpetually broken, unmaintainable dwarf annotations 2015-06-02 07:57:48 +02:00
cache-smp.c x86/lib: Audit and remove any unnecessary uses of module.h 2016-07-14 15:06:58 +02:00
checksum_32.S x86: move exports to actual definitions 2016-08-07 23:47:15 -04:00
clear_page_64.S x86/asm: Optimize clear_page() 2017-03-07 08:28:00 +01:00
cmdline.c x86/boot: Pass in size to early cmdline parsing 2016-02-03 12:03:18 +01:00
cmpxchg8b_emu.S x86: move exports to actual definitions 2016-08-07 23:47:15 -04:00
cmpxchg16b_emu.S x86/debug: Remove perpetually broken, unmaintainable dwarf annotations 2015-06-02 07:57:48 +02:00
copy_page_64.S x86: move exports to actual definitions 2016-08-07 23:47:15 -04:00
copy_user_64.S x86/uaccess: Optimize copy_user_enhanced_fast_string() for short strings 2017-06-30 09:52:51 +02:00
cpu.c x86/lib: Audit and remove any unnecessary uses of module.h 2016-07-14 15:06:58 +02:00
csum-copy_64.S x86/asm: Don't use RBP as a temporary register in csum_partial_copy_generic() 2017-05-05 07:59:24 +02:00
csum-partial_64.c x86: move exports to actual definitions 2016-08-07 23:47:15 -04:00
csum-wrappers_64.c Merge branch 'x86-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-08-01 14:23:42 -04:00
delay.c Prevent timer value 0 for MWAITX 2017-04-30 13:35:11 +02:00
getuser.S x86: move exports to actual definitions 2016-08-07 23:47:15 -04:00
hweight.S Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild 2016-10-14 14:26:58 -07:00
inat.c
insn.c x86/insn: Add AVX-512 support to the instruction decoder 2016-07-21 09:37:11 -03:00
iomap_copy_64.S x86/debug: Remove perpetually broken, unmaintainable dwarf annotations 2015-06-02 07:57:48 +02:00
kaslr.c x86/mm/kaslr: Use the _ASM_MUL macro for multiplication to work around Clang incompatibility 2017-05-05 08:31:05 +02:00
Makefile Merge branch 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-07-25 17:32:28 -07:00
memcpy_32.c x86/lib: Audit and remove any unnecessary uses of module.h 2016-07-14 15:06:58 +02:00
memcpy_64.S x86/mce: Fix copy/paste error in exception table entries 2017-03-22 08:43:25 +01:00
memmove_64.S x86: move exports to actual definitions 2016-08-07 23:47:15 -04:00
memset_64.S x86: move exports to actual definitions 2016-08-07 23:47:15 -04:00
misc.c x86/boot: Further compress CPUs bootup message 2013-10-01 10:52:30 +02:00
mmx_32.c x86/lib: Audit and remove any unnecessary uses of module.h 2016-07-14 15:06:58 +02:00
msr-reg-export.c x86/lib: Audit and remove any unnecessary uses of module.h 2016-07-14 15:06:58 +02:00
msr-reg.S x86/debug: Remove perpetually broken, unmaintainable dwarf annotations 2015-06-02 07:57:48 +02:00
msr-smp.c x86/lib: Audit and remove any unnecessary uses of module.h 2016-07-14 15:06:58 +02:00
msr.c x86/msr: Cleanup/streamline MSR helpers 2016-11-16 10:23:02 +01:00
putuser.S x86: move exports to actual definitions 2016-08-07 23:47:15 -04:00
rwsem.S locking/rwsem: Fix comment on register clobbering 2016-05-16 12:35:40 +02:00
string_32.c x86/lib: Audit and remove any unnecessary uses of module.h 2016-07-14 15:06:58 +02:00
strstr_32.c x86: move exports to actual definitions 2016-08-07 23:47:15 -04:00
usercopy_32.c x86: switch to RAW_COPY_USER 2017-03-29 12:06:28 -04:00
usercopy_64.c x86: don't wank with magical size in __copy_in_user() 2017-03-29 12:04:35 -04:00
usercopy.c x86: switch to RAW_COPY_USER 2017-03-29 12:06:28 -04:00
x86-opcode-map.txt libnvdimm for 4.8 2016-07-28 17:38:16 -07:00