linux/arch/x86
Waiman Long d78045306c locking/pvqspinlock, x86: Optimize the PV unlock code path
The unlock function in queued spinlocks was optimized for better
performance on bare metal systems at the expense of virtualized guests.

For x86-64 systems, the unlock call needs to go through a
PV_CALLEE_SAVE_REGS_THUNK() which saves and restores 8 64-bit
registers before calling the real __pv_queued_spin_unlock()
function. The thunk code may also be in a separate cacheline from
__pv_queued_spin_unlock().

This patch optimizes the PV unlock code path by:

 1) Moving the unlock slowpath code from the fastpath into a separate
    __pv_queued_spin_unlock_slowpath() function to make the fastpath
    as simple as possible..

 2) For x86-64, hand-coded an assembly function to combine the register
    saving thunk code with the fastpath code. Only registers that
    are used in the fastpath will be saved and restored. If the
    fastpath fails, the slowpath function will be called via another
    PV_CALLEE_SAVE_REGS_THUNK(). For 32-bit, it falls back to the C
    __pv_queued_spin_unlock() code as the thunk saves and restores
    only one 32-bit register.

With a microbenchmark of 5M lock-unlock loop, the table below shows
the execution times before and after the patch with different number
of threads in a VM running on a 32-core Westmere-EX box with x86-64
4.2-rc1 based kernels:

  Threads	Before patch	After patch	% Change
  -------	------------	-----------	--------
     1		   134.1 ms	  119.3 ms	  -11%
     2		   1286  ms	   953  ms	  -26%
     3		   3715  ms	  3480  ms	  -6.3%
     4		   4092  ms	  3764  ms	  -8.0%

Signed-off-by: Waiman Long <Waiman.Long@hpe.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Douglas Hatch <doug.hatch@hpe.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Scott J Norton <scott.norton@hpe.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1447114167-47185-5-git-send-email-Waiman.Long@hpe.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-11-23 10:02:02 +01:00
..
boot kasan: move KASAN_SANITIZE in arch/x86/boot/Makefile 2015-11-05 19:34:48 -08:00
configs Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux 2015-09-04 15:49:32 -07:00
crypto Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2015-11-04 09:11:12 -08:00
entry mm: mlock: add new mlock system call 2015-11-05 19:34:48 -08:00
ia32 Merge branch 'x86-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-11-03 21:05:40 -08:00
include locking/pvqspinlock, x86: Optimize the PV unlock code path 2015-11-23 10:02:02 +01:00
kernel Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-11-22 12:00:12 -08:00
kvm Four changes: 2015-11-12 14:34:06 -08:00
lguest genirq: Remove irq argument from irq flow handlers 2015-09-16 15:47:51 +02:00
lib Linux 4.3-rc1 2015-09-13 11:25:35 +02:00
math-emu Merge branch 'x86-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-11-03 21:05:40 -08:00
mm Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-11-22 12:00:12 -08:00
net ebpf: migrate bpf_prog's flags to bitfield 2015-10-03 05:02:39 -07:00
oprofile
pci PCI changes for the v4.4 merge window: 2015-11-06 11:29:53 -08:00
platform Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-11-03 21:33:18 -08:00
power x86/ldt: Make modify_ldt synchronous 2015-07-31 10:23:23 +02:00
purgatory
ras x86/ras/mce_amd_inj: Inject bank 4 errors on the NBC 2015-10-12 16:15:48 +02:00
realmode
tools
um Merge branch 'for-linus-4.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml 2015-11-10 16:33:37 -08:00
video
xen xen: features for 4.4-rc0 2015-11-04 17:32:42 -08:00
.gitignore
Kbuild x86/asm/entry, x86/vdso: Move the vDSO code to arch/x86/entry/vdso/ 2015-06-03 18:51:37 +02:00
Kconfig Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-11-03 18:59:10 -08:00
Kconfig.cpu x86/Kconfig/cpus: Fix/complete CPU type help texts 2015-10-21 11:12:56 +02:00
Kconfig.debug x86: don't make DEBUG_WX default to 'y' even with DEBUG_RODATA 2015-11-06 09:12:41 -08:00
Makefile Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2015-11-04 09:11:12 -08:00
Makefile_32.cpu
Makefile.um