linux/kernel/sched
Valentin Schneider 620a6dc407 sched/topology: Make sched_init_numa() use a set for the deduplicating sort
The deduplicating sort in sched_init_numa() assumes that the first line in
the distance table contains all unique values in the entire table. I've
been trying to pin down exactly what this means for the topology, but it's
not straightforward. For instance, topology.c uses this example:

  node   0   1   2   3
    0:  10  20  20  30
    1:  20  10  20  20
    2:  20  20  10  20
    3:  30  20  20  10

  0 ----- 1
  |     / |
  |   /   |
  | /     |
  2 ----- 3

Which works out just fine. However, if we swap nodes 0 and 1:

  1 ----- 0
  |     / |
  |   /   |
  | /     |
  2 ----- 3

we get this distance table:

  node   0   1   2   3
    0:  10  20  20  20
    1:  20  10  20  30
    2:  20  20  10  20
    3:  20  30  20  10

Which breaks the deduplicating sort: the first line is no longer
representative, as it lacks the distance 30. In this case this would just
be a renumbering exercise, but it so happens that we can have a
deduplicating sort that goes through the whole table in O(n²) at the extra
cost of a temporary memory allocation (i.e. any form of set).
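To make the failure concrete, here is a minimal userspace sketch (not the kernel code) that counts distinct distances when scanning only the first line versus the whole table. The names count_unique, table_original, table_swapped and MAX_DISTANCE are hypothetical, and the plain bool array stands in for "any form of set" under the assumption that distances fit in 8 bits.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_NODES     4
#define MAX_DISTANCE 256 /* assumption: distances fit in 8 bits */

/* Distance table of the topology.c example. */
static const int table_original[NR_NODES][NR_NODES] = {
	{ 10, 20, 20, 30 },
	{ 20, 10, 20, 20 },
	{ 20, 20, 10, 20 },
	{ 30, 20, 20, 10 },
};

/* Same topology with nodes 0 and 1 swapped. */
static const int table_swapped[NR_NODES][NR_NODES] = {
	{ 10, 20, 20, 20 },
	{ 20, 10, 20, 30 },
	{ 20, 20, 10, 20 },
	{ 20, 30, 20, 10 },
};

/*
 * Count the distinct distances found in rows [0, nr_rows) of the table.
 * nr_rows == 1 mimics a scan of the first line only; nr_rows == NR_NODES
 * is the O(n²) scan of the whole table.
 */
static int count_unique(const int t[NR_NODES][NR_NODES], int nr_rows)
{
	bool seen[MAX_DISTANCE] = { false };
	int i, j, count = 0;

	for (i = 0; i < nr_rows; i++) {
		for (j = 0; j < NR_NODES; j++) {
			assert(t[i][j] >= 0 && t[i][j] < MAX_DISTANCE);
			if (!seen[t[i][j]]) {
				seen[t[i][j]] = true;
				count++;
			}
		}
	}

	return count;
}
```

With the original numbering, both scans find the three distances 10, 20 and 30. With the swapped numbering, the first-line scan finds only 10 and 20, while the whole-table scan still finds all three.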

The ACPI spec (SLIT) mentions distances are encoded on 8 bits. Following
this, implement the set as a 256-bit bitmap. Should this not be
satisfactory (i.e. we want to support 32-bit values), then we'll have to go
for some other sparse set implementation.

This has the added benefit of letting us allocate just the right amount of
memory for sched_domains_numa_distance[], rather than an arbitrary
(nr_node_ids + 1).
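The bitmap idea and the exact-size allocation can be sketched in userspace C as follows. This is an illustration under stated assumptions, not the code in topology.c: struct distance_set, distance_set_add(), distance_set_test() and unique_sorted_distances() are hypothetical names, the table is assumed to be a row-major array of 8-bit values, and malloc() stands in for the kernel allocator.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

#define NR_DISTANCE_VALUES 256 /* every value an 8-bit distance can take */
#define BITS_PER_WORD      (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS \
	((NR_DISTANCE_VALUES + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* One bit per possible distance value: 256 bits in total. */
struct distance_set {
	unsigned long bits[BITMAP_WORDS];
};

static void distance_set_add(struct distance_set *s, uint8_t d)
{
	s->bits[d / BITS_PER_WORD] |= 1UL << (d % BITS_PER_WORD);
}

static int distance_set_test(const struct distance_set *s, unsigned int d)
{
	return (s->bits[d / BITS_PER_WORD] >> (d % BITS_PER_WORD)) & 1UL;
}

/*
 * Scan the whole n x n table, then return a heap-allocated array holding
 * each distinct distance once, in ascending order. The array has exactly
 * *nr_levels entries. Returns NULL if the allocation fails.
 */
static int *unique_sorted_distances(const uint8_t *table, size_t n,
				    size_t *nr_levels)
{
	struct distance_set set = { { 0 } };
	size_t i, count = 0, k = 0;
	unsigned int d;
	int *levels;

	/* O(n²) pass: duplicates collapse because a bit is set only once. */
	for (i = 0; i < n * n; i++)
		distance_set_add(&set, table[i]);

	/* The number of set bits is the number of distinct distances. */
	for (d = 0; d < NR_DISTANCE_VALUES; d++)
		count += distance_set_test(&set, d);

	levels = malloc(count * sizeof(*levels));
	if (!levels)
		return NULL;

	/* Walking the bits in index order yields the values already sorted. */
	for (d = 0; d < NR_DISTANCE_VALUES; d++) {
		if (distance_set_test(&set, d))
			levels[k++] = (int)d;
	}

	assert(k == count);
	*nr_levels = count;
	return levels;
}
```

Applied to the swapped table above, this yields three levels (10, 20, 30) regardless of which line holds the value 30. The sort comes for free because the bitmap is indexed by distance value, which is also why a 32-bit value range would need a different, sparse set.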

Note: the DT binding equivalent (distance-map) decodes distances as 32-bit
values.

Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210122123943.1217-2-valentin.schneider@arm.com
2021-01-27 17:26:42 +01:00
autogroup.c
autogroup.h
clock.c
completion.c
core.c sched: Prevent raising SCHED_SOFTIRQ when CPU is !active 2021-01-14 11:20:09 +01:00
cpuacct.c
cpudeadline.c sched,rt: Use the full cpumask for balancing 2020-11-10 18:39:00 +01:00
cpudeadline.h
cpufreq_schedutil.c sched/core: Rename schedutil_cpu_util() and allow rest of the kernel to use it 2021-01-14 11:20:09 +01:00
cpufreq.c
cpupri.c Merge branch 'sched/migrate-disable' 2020-11-10 18:39:04 +01:00
cpupri.h sched/cpupri: Add CPUPRI_HIGHER 2020-10-29 11:00:30 +01:00
cputime.c irqtime: Move irqtime entry accounting after irq offset incrementation 2020-12-02 20:20:05 +01:00
deadline.c sched: Use task_current() instead of 'rq->curr == p' 2021-01-14 11:20:11 +01:00
debug.c sched: Use task_current() instead of 'rq->curr == p' 2021-01-14 11:20:11 +01:00
fair.c sched/eas: Don't update misfit status if the task is pinned 2021-01-27 17:26:42 +01:00
features.h sched/rt: Disable RT_RUNTIME_SHARE by default 2020-09-25 14:23:24 +02:00
idle.c Scheduler updates: 2020-12-14 18:29:11 -08:00
isolation.c isolcpus: Affine unbound kernel threads to housekeeping cpus 2020-06-15 14:10:03 +02:00
loadavg.c sched: nohz: stop passing around unused "ticks" parameter. 2020-07-22 10:22:04 +02:00
Makefile
membarrier.c Scheduler updates: 2020-12-14 18:29:11 -08:00
pelt.c sched: Add a tracepoint to track rq->nr_running 2020-07-08 11:39:02 +02:00
pelt.h sched/pelt: Cleanup PELT divider 2020-06-15 14:10:06 +02:00
psi.c sched,psi: Convert to sched_set_fifo_low() 2020-06-15 14:10:25 +02:00
rt.c sched: Use task_current() instead of 'rq->curr == p' 2021-01-14 11:20:11 +01:00
sched-pelt.h
sched.h sched/core: Rename schedutil_cpu_util() and allow rest of the kernel to use it 2021-01-14 11:20:09 +01:00
smp.h sched/headers: Split out open-coded prototypes into kernel/sched/smp.h 2020-05-28 11:03:20 +02:00
stats.c
stats.h
stop_task.c sched: Remove select_task_rq()'s sd_flag parameter 2020-11-10 18:39:06 +01:00
swait.c
topology.c sched/topology: Make sched_init_numa() use a set for the deduplicating sort 2021-01-27 17:26:42 +01:00
wait_bit.c
wait.c sched/wait: Add add_wait_queue_priority() 2020-11-15 09:49:09 -05:00