linux/kernel
Joel Fernandes f453ae2200 sched/fair: Consider RT/IRQ pressure in capacity_spare_wake()
capacity_spare_wake() in the slow path influences choice of idlest groups,
as we search for groups with maximum spare capacity. In scenarios where
RT pressure is high, a sub optimal group can be chosen and hurt
performance of the task being woken up.

Fix this by using capacity_of() instead of capacity_orig_of() in capacity_spare_wake().

Tests results from improvements with this change are below. More tests
were also done by myself and Matt Fleming to ensure no degradation in
different benchmarks.

1) Rohit ran barrier.c test (details below) with following improvements:
------------------------------------------------------------------------
This was Rohit's original use case for a patch he posted at [1] however
from his recent tests he showed my patch can replace his slow path
changes [1] and there's no need to selectively scan/skip CPUs in
find_idlest_group_cpu in the slow path to get the improvement he sees.

barrier.c (open_mp code) as a micro-benchmark. It does a number of
iterations and barrier sync at the end of each for loop.

Here barrier,c is running in along with ping on CPU 0 and 1 as:
'ping -l 10000 -q -s 10 -f hostX'

barrier.c can be found at:
http://www.spinics.net/lists/kernel/msg2506955.html

Following are the results for the iterations per second with this
micro-benchmark (higher is better), on a 44 core, 2 socket 88 Threads
Intel x86 machine:
+--------+------------------+---------------------------+
|Threads | Without patch    | With patch                |
|        |                  |                           |
+--------+--------+---------+-----------------+---------+
|        | Mean   | Std Dev | Mean            | Std Dev |
+--------+--------+---------+-----------------+---------+
|1       | 539.36 | 60.16   | 572.54 (+6.15%) | 40.95   |
|2       | 481.01 | 19.32   | 530.64 (+10.32%)| 56.16   |
|4       | 474.78 | 22.28   | 479.46 (+0.99%) | 18.89   |
|8       | 450.06 | 24.91   | 447.82 (-0.50%) | 12.36   |
|16      | 436.99 | 22.57   | 441.88 (+1.12%) | 7.39    |
|32      | 388.28 | 55.59   | 429.4  (+10.59%)| 31.14   |
|64      | 314.62 | 6.33    | 311.81 (-0.89%) | 11.99   |
+--------+--------+---------+-----------------+---------+

2) ping+hackbench test on bare-metal sever (by Rohit)
-----------------------------------------------------
Here hackbench is running in threaded mode along
with, running ping on CPU 0 and 1 as:
'ping -l 10000 -q -s 10 -f hostX'

This test is running on 2 socket, 20 core and 40 threads Intel x86
machine:
Number of loops is 10000 and runtime is in seconds (Lower is better).

+--------------+-----------------+--------------------------+
|Task Groups   | Without patch   |  With patch              |
|              +-------+---------+----------------+---------+
|(Groups of 40)| Mean  | Std Dev |  Mean          | Std Dev |
+--------------+-------+---------+----------------+---------+
|1             | 0.851 | 0.007   |  0.828 (+2.77%)| 0.032   |
|2             | 1.083 | 0.203   |  1.087 (-0.37%)| 0.246   |
|4             | 1.601 | 0.051   |  1.611 (-0.62%)| 0.055   |
|8             | 2.837 | 0.060   |  2.827 (+0.35%)| 0.031   |
|16            | 5.139 | 0.133   |  5.107 (+0.63%)| 0.085   |
|25            | 7.569 | 0.142   |  7.503 (+0.88%)| 0.143   |
+--------------+-------+---------+----------------+---------+

[1] https://patchwork.kernel.org/patch/9991635/

Matt Fleming also ran several different hackbench tests and cyclic test
to santiy-check that the patch doesn't harm other usecases.

Tested-by: Matt Fleming <matt@codeblueprint.co.uk>
Tested-by: Rohit Jain <rohit.k.jain@oracle.com>
Signed-off-by: Joel Fernandes <joelaf@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Atish Patra <atish.patra@oracle.com>
Cc: Brendan Jackman <brendan.jackman@arm.com>
Cc: Chris Redpath <Chris.Redpath@arm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Len Brown <lenb@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Morten Ramussen <morten.rasmussen@arm.com>
Cc: Patrick Bellasi <patrick.bellasi@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Saravana Kannan <skannan@quicinc.com>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: Steve Muckle <smuckle@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vikram Mulukutla <markivx@codeaurora.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Link: http://lkml.kernel.org/r/20171214212158.188190-1-joelaf@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-10 11:30:28 +01:00
..
bpf bpf: do not allow root to mangle valid pointers 2017-12-21 02:26:29 +01:00
cgroup Revert "cgroup/cpuset: remove circular dependency deadlock" 2017-12-04 14:55:59 -08:00
configs ANDROID: binder: add hwbinder,vndbinder to BINDER_DEVICES. 2017-08-22 18:43:23 -07:00
debug kdb: Fix handling of kallsyms_symbol_next() return value 2017-12-06 16:12:43 -06:00
events locking/barriers: Convert users of lockless_dereference() to READ_ONCE() 2017-12-17 13:57:15 +01:00
gcov License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
irq genirq/msi, x86/vector: Prevent reservation mode for non maskable MSI 2017-12-29 21:13:05 +01:00
livepatch Merge branch 'for-linus' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching 2017-11-15 10:21:58 -08:00
locking locking/lockdep: Remove the cross-release locking checks 2017-12-12 12:38:51 +01:00
power Revert "cpuset: Make cpuset hotplug synchronous" 2017-12-04 14:41:11 -08:00
printk remove task and stack pointer printout from oops dump 2017-12-05 08:23:20 -08:00
rcu Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-11-13 17:56:58 -08:00
sched sched/fair: Consider RT/IRQ pressure in capacity_spare_wake() 2018-01-10 11:30:28 +01:00
time Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-12-31 12:30:34 -08:00
trace tracing: Fix possible double free on failure of allocating trace buffer 2017-12-27 14:21:27 -05:00
.gitignore
acct.c kernel/acct.c: fix the acct->needcheck check in check_free_space() 2018-01-04 16:45:09 -08:00
async.c
audit_fsnotify.c
audit_tree.c Merge branch 'fsnotify' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs 2017-11-14 14:08:20 -08:00
audit_watch.c audit/stable-4.13 PR 20170816 2017-08-16 16:48:34 -07:00
audit.c Audit: remove unused audit_log_secctx function 2017-11-10 16:08:47 -05:00
audit.h audit/stable-4.15 PR 20171113 2017-11-15 13:28:48 -08:00
auditfilter.c audit: filter PATH records keyed on filesystem magic 2017-11-10 16:08:56 -05:00
auditsc.c audit/stable-4.15 PR 20171113 2017-11-15 13:28:48 -08:00
backtracetest.c
bounds.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
capability.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
compat.c sched_rr_get_interval(): move compat to native, get rid of set_fs() 2017-09-20 00:30:57 -04:00
configs.c
context_tracking.c
cpu_pm.c PM / CPU: replace raw_notifier with atomic_notifier 2017-07-31 13:09:49 +02:00
cpu.c Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-12-31 12:30:34 -08:00
crash_core.c kdump: print a message in case parse_crashkernel_mem resulted in zero bytes 2017-11-17 16:10:03 -08:00
crash_dump.c
cred.c
delayacct.c
dma.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
elfcore.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
exec_domain.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
exit.c kernel/exit.c: export abort() to modules 2018-01-04 16:45:09 -08:00
extable.c kprobes, x86/alternatives: Use text_mutex to protect smp_alt_modules 2017-11-07 12:20:09 +01:00
fork.c Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-12-23 11:53:04 -08:00
freezer.c
futex_compat.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
futex.c futex: futex_wake_op, fix sign_extend32 sign bits 2017-12-10 12:50:57 -08:00
groups.c kernel: make groups_sort calling a responsibility group_info allocators 2017-12-14 16:00:49 -08:00
hung_task.c
irq_work.c Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-11-13 17:33:11 -08:00
jump_label.c jump_label: Invoke jump_label_test() via early_initcall() 2017-11-14 08:41:41 +01:00
kallsyms.c kallsyms: take advantage of the new '%px' format 2017-11-29 10:30:13 -08:00
kcmp.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt
kcov.c kcov: fix comparison callback signature 2017-12-14 16:00:48 -08:00
kexec_core.c x86/mm, kexec: Allow kexec to be used with SME 2017-07-18 11:38:04 +02:00
kexec_file.c resource: Provide resource struct in resource walk callback 2017-11-07 15:35:57 +01:00
kexec_internal.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
kexec.c kdump: protect vmcoreinfo data under the crash memory 2017-07-12 16:26:00 -07:00
kmod.c kmod: move #ifdef CONFIG_MODULES wrapper to Makefile 2017-09-08 18:26:51 -07:00
kprobes.c kprobes: Disable the jprobes APIs 2017-10-20 11:02:29 +02:00
ksysfs.c kexec: move vmcoreinfo out of the kernel's .bss section 2017-07-12 16:25:59 -07:00
kthread.c treewide: Remove TIMER_FUNC_TYPE and TIMER_DATA_TYPE casts 2017-11-21 16:35:54 -08:00
latencytop.c
Makefile License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
memremap.c memremap: add scheduling point to devm_memremap_pages 2017-10-03 17:54:25 -07:00
module_signing.c
module-internal.h
module.c kallsyms: take advantage of the new '%px' format 2017-11-29 10:30:13 -08:00
notifier.c
nsproxy.c
padata.c treewide: setup_timer() -> timer_setup() 2017-11-21 15:57:07 -08:00
panic.c kernel/panic.c: add TAINT_AUX 2017-11-17 16:10:04 -08:00
params.c kernel/params.c: improve STANDARD_PARAM_DEF readability 2017-10-03 17:54:26 -07:00
pid_namespace.c pid: remove pidhash 2017-11-17 16:10:04 -08:00
pid.c pid: Handle failure to allocate the first pid in a pid namespace 2017-12-23 21:00:09 -06:00
profile.c
ptrace.c signal: Remove kernel interal si_code magic 2017-07-24 14:30:28 -05:00
range.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
reboot.c kernel/reboot.c: add devm_register_reboot_notifier() 2017-11-17 16:10:04 -08:00
relay.c
resource.c x86/mm, resource: Use PAGE_KERNEL protection for ioremap of memory pages 2017-11-07 15:35:58 +01:00
seccomp.c locking/barriers: Convert users of lockless_dereference() to READ_ONCE() 2017-12-17 13:57:15 +01:00
signal.c Merge branch 'akpm' (patches from Andrew) 2017-11-17 16:56:17 -08:00
smp.c smp/core: Use lockdep to assert IRQs are disabled/enabled 2017-11-08 11:13:50 +01:00
smpboot.c watchdog/core, powerpc: Lock cpus across reconfiguration 2017-10-04 10:53:54 +02:00
smpboot.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
softirq.c kmemcheck: rip it out 2017-11-15 18:21:05 -08:00
stacktrace.c
stop_machine.c
sys_ni.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
sys.c arm64 updates for 4.15 2017-11-15 10:56:56 -08:00
sysctl_binary.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
sysctl.c kernel/sysctl.c: code cleanups 2017-11-17 16:10:03 -08:00
task_work.c locking/barriers: Convert users of lockless_dereference() to READ_ONCE() 2017-12-17 13:57:15 +01:00
taskstats.c
test_kprobes.c kprobes: Disable the jprobes test code 2017-10-20 11:02:54 +02:00
torture.c torture: Fix typo suppressing CPU-hotplug statistics 2017-07-25 13:04:45 -07:00
tracepoint.c
tsacct.c
ucount.c
uid16.c kernel: make groups_sort calling a responsibility group_info allocators 2017-12-14 16:00:49 -08:00
umh.c kernel/umh.c: optimize 'proc_cap_handler()' 2017-11-17 16:10:01 -08:00
up.c smp: Avoid using two cache lines for struct call_single_data 2017-08-29 15:14:38 +02:00
user_namespace.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2017-11-16 12:20:15 -08:00
user-return-notifier.c
user.c userns: use union in {g,u}idmap struct 2017-10-31 17:22:58 -05:00
utsname_sysctl.c
utsname.c
watchdog_hld.c Merge branch 'linus' into core/urgent, to pick up dependent commits 2017-11-04 08:53:04 +01:00
watchdog.c Merge branch 'linus' into sched/core, to pick up fixes 2017-11-08 10:17:15 +01:00
workqueue_internal.h Merge branch 'for-4.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq 2017-11-06 12:26:49 -08:00
workqueue.c workqueue: remove unneeded kallsyms include 2017-12-11 07:15:43 -08:00