linux/kernel/sched
Qais Yousef 0c18f2ecfc sched/uclamp: Fix wrong implementation of cpu.uclamp.min
cpu.uclamp.min is a protection as described in cgroup-v2 Resource
Distribution Model

	Documentation/admin-guide/cgroup-v2.rst

which means we try our best to preserve the minimum performance point of
tasks in this group. See full description of cpu.uclamp.min in the
cgroup-v2.rst.

But the current implementation makes it a limit, which is not what was
intended.

For example:

	tg->cpu.uclamp.min = 20%

	p0->uclamp[UCLAMP_MIN] = 0
	p1->uclamp[UCLAMP_MIN] = 50%

	Previous Behavior (limit):

		p0->effective_uclamp = 0
		p1->effective_uclamp = 20%

	New Behavior (Protection):

		p0->effective_uclamp = 20%
		p1->effective_uclamp = 50%

Which is inline with how protections should work.

With this change the cgroup and per-task behaviors are the same, as
expected.

Additionally, we remove the confusing relationship between cgroup and
!user_defined flag.

We don't want for example RT tasks that are boosted by default to max to
change their boost value when they attach to a cgroup. If a cgroup wants
to limit the max performance point of tasks attached to it, then
cpu.uclamp.max must be set accordingly.

Or if they want to set different boost value based on cgroup, then
sysctl_sched_util_clamp_min_rt_default must be used to NOT boost to max
and set the right cpu.uclamp.min for each group to let the RT tasks
obtain the desired boost value when attached to that group.

As it stands the dependency on !user_defined flag adds an extra layer of
complexity that is not required now cpu.uclamp.min behaves properly as
a protection.

The propagation model of effective cpu.uclamp.min in child cgroups as
implemented by cpu_util_update_eff() is still correct. The parent
protection sets an upper limit of what the child cgroups will
effectively get.

Fixes: 3eac870a32 (sched/uclamp: Use TG's clamps to restrict TASK's clamps)
Signed-off-by: Qais Yousef <qais.yousef@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210510145032.1934078-2-qais.yousef@arm.com
2021-05-19 10:53:02 +02:00
..
autogroup.c
autogroup.h
clock.c sched: Fix various typos 2021-03-22 00:11:52 +01:00
completion.c completion: Use lockdep_assert_RT_in_threaded_ctx() in complete_all() 2020-03-23 18:40:25 +01:00
core_sched.c sched: prctl() core-scheduling interface 2021-05-12 11:43:31 +02:00
core.c sched/uclamp: Fix wrong implementation of cpu.uclamp.min 2021-05-19 10:53:02 +02:00
cpuacct.c sched: Wrap rq::lock access 2021-05-12 11:43:26 +02:00
cpudeadline.c sched,rt: Use the full cpumask for balancing 2020-11-10 18:39:00 +01:00
cpudeadline.h
cpufreq_schedutil.c Scheduler updates for this cycle are: 2021-04-28 13:33:57 -07:00
cpufreq.c cpufreq: Avoid leaving stale IRQ work items during CPU offline 2019-12-12 17:59:43 +01:00
cpupri.c sched: Fix various typos 2021-03-22 00:11:52 +01:00
cpupri.h sched/cpupri: Add CPUPRI_HIGHER 2020-10-29 11:00:30 +01:00
cputime.c Scheduler updates for this cycle are: 2021-04-28 13:33:57 -07:00
deadline.c sched: Introduce sched_class::pick_task() 2021-05-12 11:43:28 +02:00
debug.c sched: Wrap rq::lock access 2021-05-12 11:43:26 +02:00
fair.c sched: Fix leftover comment typos 2021-05-12 19:54:49 +02:00
features.h sched: Warn on long periods of pending need_resched 2021-04-21 13:55:41 +02:00
idle.c sched: Trivial forced-newidle balancer 2021-05-12 11:43:30 +02:00
isolation.c sched/isolation: Reconcile rcu_nocbs= and nohz_full= 2021-05-13 14:12:47 +02:00
loadavg.c sched: Make multiple runqueue task counters 32-bit 2021-05-12 21:34:17 +02:00
Makefile sched: Trivial core scheduling cookie management 2021-05-12 11:43:31 +02:00
membarrier.c sched/membarrier: fix missing local execution of ipi_sync_rq_state() 2021-03-06 12:40:21 +01:00
pelt.c sched: Fix various typos 2021-03-22 00:11:52 +01:00
pelt.h sched: Wrap rq::lock access 2021-05-12 11:43:26 +02:00
psi.c psi: Fix psi state corruption when schedule() races with cgroup move 2021-05-06 15:33:26 +02:00
rt.c sched: Introduce sched_class::pick_task() 2021-05-12 11:43:28 +02:00
sched-pelt.h
sched.h sched: Make multiple runqueue task counters 32-bit 2021-05-12 21:34:17 +02:00
smp.h sched/headers: Split out open-coded prototypes into kernel/sched/smp.h 2020-05-28 11:03:20 +02:00
stats.c sched: Fix various typos 2021-03-22 00:11:52 +01:00
stats.h sched,stats: Further simplify sched_info 2021-05-18 12:53:53 +02:00
stop_task.c sched: Introduce sched_class::pick_task() 2021-05-12 11:43:28 +02:00
swait.c sched/swait: Prepare usage in completions 2020-03-21 16:00:23 +01:00
topology.c sched: Wrap rq::lock access 2021-05-12 11:43:26 +02:00
wait_bit.c sched/wait: fix ___wait_var_event(exclusive) 2019-12-17 13:32:50 +01:00
wait.c sched/wait: Add add_wait_queue_priority() 2020-11-15 09:49:09 -05:00