mirror of
https://github.com/torvalds/linux.git
synced 2024-12-01 16:41:39 +00:00
617f3ef951
Reader optimistic spinning is helpful when the reader critical section
is short and there aren't that many readers around. It also improves
the chance that a reader can get the lock as writer optimistic spinning
disproportionally favors writers much more than readers.
Since commit d3681e269f
("locking/rwsem: Wake up almost all readers
in wait queue"), all the waiting readers are woken up so that they can
all get the read lock and run in parallel. When the number of contending
readers is large, allowing reader optimistic spinning will likely cause
reader fragmentation where multiple smaller groups of readers can get
the read lock in a sequential manner separated by writers. That reduces
reader parallelism.
One possible way to address that drawback is to limit the number of
readers (preferably one) that can do optimistic spinning. These readers
act as representatives of all the waiting readers in the wait queue as
they will wake up all those waiting readers once they get the lock.
Alternatively, as reader optimistic lock stealing has already enhanced
fairness to readers, it may be easier to just remove reader optimistic
spinning and simplifying the optimistic spinning code as a result.
Performance measurements (locking throughput kops/s) using a locking
microbenchmark with 50/50 reader/writer distribution and turbo-boost
disabled was done on a 2-socket Cascade Lake system (48-core 96-thread)
to see the impacts of these changes:
1) Vanilla - 5.10-rc3 kernel
2) Before - 5.10-rc3 kernel with previous patches in this series
2) limit-rspin - 5.10-rc3 kernel with limited reader spinning patch
3) no-rspin - 5.10-rc3 kernel with reader spinning disabled
# of threads CS Load Vanilla Before limit-rspin no-rspin
------------ ------- ------- ------ ----------- --------
2 1 5,185 5,662 5,214 5,077
4 1 5,107 4,983 5,188 4,760
8 1 4,782 4,564 4,720 4,628
16 1 4,680 4,053 4,567 3,402
32 1 4,299 1,115 1,118 1,098
64 1 3,218 983 1,001 957
96 1 1,938 944 957 930
2 20 2,008 2,128 2,264 1,665
4 20 1,390 1,033 1,046 1,101
8 20 1,472 1,155 1,098 1,213
16 20 1,332 1,077 1,089 1,122
32 20 967 914 917 980
64 20 787 874 891 858
96 20 730 836 847 844
2 100 372 356 360 355
4 100 492 425 434 392
8 100 533 537 529 538
16 100 548 572 568 598
32 100 499 520 527 537
64 100 466 517 526 512
96 100 406 497 506 509
The column "CS Load" represents the number of pause instructions issued
in the locking critical section. A CS load of 1 is extremely short and
is not likey in real situations. A load of 20 (moderate) and 100 (long)
are more realistic.
It can be seen that the previous patches in this series have reduced
performance in general except in highly contended cases with moderate
or long critical sections that performance improves a bit. This change
is mostly caused by the "Prevent potential lock starvation" patch that
reduce reader optimistic spinning and hence reduce reader fragmentation.
The patch that further limit reader optimistic spinning doesn't seem to
have too much impact on overall performance as shown in the benchmark
data.
The patch that disables reader optimistic spinning shows reduced
performance at lightly loaded cases, but comparable or slightly better
performance on with heavier contention.
This patch just removes reader optimistic spinning for now. As readers
are not going to do optimistic spinning anymore, we don't need to
consider if the OSQ is empty or not when doing lock stealing.
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Davidlohr Bueso <dbueso@suse.de>
Link: https://lkml.kernel.org/r/20201121041416.12285-6-longman@redhat.com
70 lines
3.1 KiB
C
70 lines
3.1 KiB
C
/* SPDX-License-Identifier: GPL-2.0 */
|
|
/*
|
|
* This program is free software; you can redistribute it and/or modify
|
|
* it under the terms of the GNU General Public License as published by
|
|
* the Free Software Foundation; either version 2 of the License, or
|
|
* (at your option) any later version.
|
|
*
|
|
* This program is distributed in the hope that it will be useful,
|
|
* but WITHOUT ANY WARRANTY; without even the implied warranty of
|
|
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
|
* GNU General Public License for more details.
|
|
*
|
|
* Authors: Waiman Long <longman@redhat.com>
|
|
*/
|
|
|
|
#ifndef LOCK_EVENT
|
|
#define LOCK_EVENT(name) LOCKEVENT_ ## name,
|
|
#endif
|
|
|
|
#ifdef CONFIG_QUEUED_SPINLOCKS
|
|
#ifdef CONFIG_PARAVIRT_SPINLOCKS
|
|
/*
|
|
* Locking events for PV qspinlock.
|
|
*/
|
|
LOCK_EVENT(pv_hash_hops) /* Average # of hops per hashing operation */
|
|
LOCK_EVENT(pv_kick_unlock) /* # of vCPU kicks issued at unlock time */
|
|
LOCK_EVENT(pv_kick_wake) /* # of vCPU kicks for pv_latency_wake */
|
|
LOCK_EVENT(pv_latency_kick) /* Average latency (ns) of vCPU kick */
|
|
LOCK_EVENT(pv_latency_wake) /* Average latency (ns) of kick-to-wakeup */
|
|
LOCK_EVENT(pv_lock_stealing) /* # of lock stealing operations */
|
|
LOCK_EVENT(pv_spurious_wakeup) /* # of spurious wakeups in non-head vCPUs */
|
|
LOCK_EVENT(pv_wait_again) /* # of wait's after queue head vCPU kick */
|
|
LOCK_EVENT(pv_wait_early) /* # of early vCPU wait's */
|
|
LOCK_EVENT(pv_wait_head) /* # of vCPU wait's at the queue head */
|
|
LOCK_EVENT(pv_wait_node) /* # of vCPU wait's at non-head queue node */
|
|
#endif /* CONFIG_PARAVIRT_SPINLOCKS */
|
|
|
|
/*
|
|
* Locking events for qspinlock
|
|
*
|
|
* Subtracting lock_use_node[234] from lock_slowpath will give you
|
|
* lock_use_node1.
|
|
*/
|
|
LOCK_EVENT(lock_pending) /* # of locking ops via pending code */
|
|
LOCK_EVENT(lock_slowpath) /* # of locking ops via MCS lock queue */
|
|
LOCK_EVENT(lock_use_node2) /* # of locking ops that use 2nd percpu node */
|
|
LOCK_EVENT(lock_use_node3) /* # of locking ops that use 3rd percpu node */
|
|
LOCK_EVENT(lock_use_node4) /* # of locking ops that use 4th percpu node */
|
|
LOCK_EVENT(lock_no_node) /* # of locking ops w/o using percpu node */
|
|
#endif /* CONFIG_QUEUED_SPINLOCKS */
|
|
|
|
/*
|
|
* Locking events for rwsem
|
|
*/
|
|
LOCK_EVENT(rwsem_sleep_reader) /* # of reader sleeps */
|
|
LOCK_EVENT(rwsem_sleep_writer) /* # of writer sleeps */
|
|
LOCK_EVENT(rwsem_wake_reader) /* # of reader wakeups */
|
|
LOCK_EVENT(rwsem_wake_writer) /* # of writer wakeups */
|
|
LOCK_EVENT(rwsem_opt_lock) /* # of opt-acquired write locks */
|
|
LOCK_EVENT(rwsem_opt_fail) /* # of failed optspins */
|
|
LOCK_EVENT(rwsem_opt_nospin) /* # of disabled optspins */
|
|
LOCK_EVENT(rwsem_rlock) /* # of read locks acquired */
|
|
LOCK_EVENT(rwsem_rlock_steal) /* # of read locks by lock stealing */
|
|
LOCK_EVENT(rwsem_rlock_fast) /* # of fast read locks acquired */
|
|
LOCK_EVENT(rwsem_rlock_fail) /* # of failed read lock acquisitions */
|
|
LOCK_EVENT(rwsem_rlock_handoff) /* # of read lock handoffs */
|
|
LOCK_EVENT(rwsem_wlock) /* # of write locks acquired */
|
|
LOCK_EVENT(rwsem_wlock_fail) /* # of failed write lock acquisitions */
|
|
LOCK_EVENT(rwsem_wlock_handoff) /* # of write lock handoffs */
|