forked from Minki/linux
db38ee874c
Looking at perf profiles of multi-threaded hackbench runs, a significant performance hit appears to manifest from the cmpxchg loop used to implement the 32-bit atomic_add_unless function. This can be mitigated by writing a direct implementation of __atomic_add_unless which doesn't require iteration outside of the atomic operation. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> |
||
---|---|---|
.. | ||
asm | ||
debug | ||
uapi/asm |