linux/kernel/sched
John Stultz ddae0ca2a8 sched: Move psi_account_irqtime() out of update_rq_clock_task() hotpath
It was reported that in moving to 6.1, a larger then 10%
regression was seen in the performance of
clock_gettime(CLOCK_THREAD_CPUTIME_ID,...).

Using a simple reproducer, I found:
5.10:
100000000 calls in 24345994193 ns => 243.460 ns per call
100000000 calls in 24288172050 ns => 242.882 ns per call
100000000 calls in 24289135225 ns => 242.891 ns per call

6.1:
100000000 calls in 28248646742 ns => 282.486 ns per call
100000000 calls in 28227055067 ns => 282.271 ns per call
100000000 calls in 28177471287 ns => 281.775 ns per call

The cause of this was finally narrowed down to the addition of
psi_account_irqtime() in update_rq_clock_task(), in commit
52b1364ba0 ("sched/psi: Add PSI_IRQ to track IRQ/SOFTIRQ
pressure").

In my initial attempt to resolve this, I leaned towards moving
all accounting work out of the clock_gettime() call path, but it
wasn't very pretty, so it will have to wait for a later deeper
rework. Instead, Peter shared this approach:

Rework psi_account_irqtime() to use its own psi_irq_time base
for accounting, and move it out of the hotpath, calling it
instead from sched_tick() and __schedule().

In testing this, we found the importance of ensuring
psi_account_irqtime() is run under the rq_lock, which Johannes
Weiner helpfully explained, so also add some lockdep annotations
to make that requirement clear.

With this change the performance is back in-line with 5.10:
6.1+fix:
100000000 calls in 24297324597 ns => 242.973 ns per call
100000000 calls in 24318869234 ns => 243.189 ns per call
100000000 calls in 24291564588 ns => 242.916 ns per call

Reported-by: Jimmy Shiu <jimmyshiu@google.com>
Originally-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: John Stultz <jstultz@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Chengming Zhou <chengming.zhou@linux.dev>
Reviewed-by: Qais Yousef <qyousef@layalina.io>
Link: https://lore.kernel.org/r/20240618215909.4099720-1-jstultz@google.com
2024-07-01 13:01:44 +02:00
..
autogroup.c scheduler: Remove the now superfluous sentinel elements from ctl_table array 2024-04-24 09:43:54 +02:00
autogroup.h sched/headers: Add header guard to kernel/sched/stats.h and kernel/sched/autogroup.h 2022-02-23 08:22:00 +01:00
build_policy.c sched: Fix missing prototype warnings 2022-05-01 10:03:43 +02:00
build_utility.c sched/headers: Remove duplicate header inclusions 2023-10-03 21:27:55 +02:00
clock.c Locking changes for v6.5: 2023-06-27 14:14:30 -07:00
completion.c sched: add a few helpers to wake up tasks on the current cpu 2023-07-17 16:08:08 -07:00
core_sched.c sched: Rename task_running() to task_on_cpu() 2022-09-07 21:53:47 +02:00
core.c sched: Move psi_account_irqtime() out of update_rq_clock_task() hotpath 2024-07-01 13:01:44 +02:00
cpuacct.c Merge branch 'sched/fast-headers' into sched/core 2022-03-15 09:05:05 +01:00
cpudeadline.c sched/topology: Consolidate and clean up access to a CPU's max compute capacity 2023-10-09 12:59:48 +02:00
cpudeadline.h
cpufreq_schedutil.c sched/fair: Fix frequency selection for non-invariant case 2024-01-16 10:41:25 +01:00
cpufreq.c sched/headers: Introduce kernel/sched/build_utility.c and build multiple .c files there 2022-02-23 10:58:33 +01:00
cpupri.c sched/rt: Fix live lock between select_fallback_rq() and RT push 2023-09-28 22:58:13 +02:00
cpupri.h
cputime.c sched/vtime: Get rid of generic vtime_task_switch() implementation 2024-04-17 13:37:20 +02:00
deadline.c sched/deadline: Fix task_struct reference leak 2024-07-01 13:01:44 +02:00
debug.c sched/debug: Dump domains' level 2024-05-17 09:48:25 +02:00
fair.c Revert "sched/fair: Make sure to try to detach at least one movable task" 2024-07-01 13:01:43 +02:00
features.h sched/fair: Remove SCHED_FEAT(UTIL_EST_FASTUP, true) 2023-12-23 15:59:56 +01:00
idle.c Core x86 changes for v6.9: 2024-03-11 19:53:15 -07:00
isolation.c sched/isolation: Fix boot crash when maxcpus < first housekeeping CPU 2024-04-28 10:08:21 +02:00
loadavg.c sched/balancing: Rename scheduler_tick() => sched_tick() 2024-03-12 11:59:59 +01:00
Makefile sched/headers: Introduce kernel/sched/build_policy.c and build multiple .c files there 2022-02-23 10:58:33 +01:00
membarrier.c RISC-V Patches for the 6.9 Merge Window 2024-03-22 10:41:13 -07:00
pelt.c sched/cpufreq: Rename arch_update_thermal_pressure() => arch_update_hw_pressure() 2024-04-24 12:08:01 +02:00
pelt.h sched/cpufreq: Rename arch_update_thermal_pressure() => arch_update_hw_pressure() 2024-04-24 12:08:01 +02:00
psi.c sched: Move psi_account_irqtime() out of update_rq_clock_task() hotpath 2024-07-01 13:01:44 +02:00
rt.c scheduler: Remove the now superfluous sentinel elements from ctl_table array 2024-04-24 09:43:54 +02:00
sched-pelt.h
sched.h sched: Move psi_account_irqtime() out of update_rq_clock_task() hotpath 2024-07-01 13:01:44 +02:00
smp.h sched, smp: Trace smp callback causing an IPI 2023-03-24 11:01:29 +01:00
stats.c sched/debug: Increase SCHEDSTAT_VERSION to 16 2024-03-12 11:03:40 +01:00
stats.h sched: Move psi_account_irqtime() out of update_rq_clock_task() hotpath 2024-07-01 13:01:44 +02:00
stop_task.c sched: Unify runtime accounting across classes 2023-11-15 09:57:48 +01:00
swait.c sched: add a few helpers to wake up tasks on the current cpu 2023-07-17 16:08:08 -07:00
topology.c bitmap patches for 6.10 2024-05-21 15:29:01 -07:00
wait_bit.c wait_on_bit: add an acquire memory barrier 2022-08-26 09:30:25 -07:00
wait.c sched: remove wait bookmarks 2023-10-18 14:34:18 -07:00