linux/include
Tejun Heo e32c260195 sched_ext: Enable the ops breather and eject BPF scheduler on softlockup
On 2 x Intel Sapphire Rapids machines with 224 logical CPUs, a poorly
behaving BPF scheduler can live-lock the system by making multiple CPUs bang
on the same DSQ to the point where soft-lockup detection triggers before
SCX's own watchdog can take action. It also seems possible that the machine
can be live-locked enough to prevent scx_ops_helper, which is an RT task,
from running in a timely manner.

Implement scx_softlockup() which is called when three quarters of
soft-lockup threshold has passed. The function immediately enables the ops
breather and triggers an ops error to initiate ejection of the BPF
scheduler.

The previous and this patch combined enable the kernel to reliably recover
the system from live-lock conditions that can be triggered by a poorly
behaving BPF scheduler on Intel dual socket systems.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Douglas Anderson <dianders@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
2024-11-08 10:42:22 -10:00
..
acpi Power management updates for 6.12-rc1 2024-09-16 07:47:50 +02:00
asm-generic sched_ext: Initial pull request for v6.12 2024-09-21 09:44:57 -07:00
clocksource
crypto
drm drm next for 6.12-rc1 2024-09-19 10:18:15 +02:00
dt-bindings The core clk framework is left largely untouched this time around except for 2024-09-23 15:01:48 -07:00
keys KEYS: Remove unused declarations 2024-09-20 18:28:26 +03:00
kunit The core clk framework is left largely untouched this time around except for 2024-09-23 15:01:48 -07:00
kvm
linux sched_ext: Enable the ops breather and eject BPF scheduler on softlockup 2024-11-08 10:42:22 -10:00
math-emu
media media: cec: move cec_get/put_device to header 2024-09-05 20:12:15 +02:00
memory
misc
net bluetooth-next pull request for net-next: 2024-09-13 19:50:25 -07:00
pcmcia
ras
rdma RDMA/nldev: Add support for RDMA monitoring 2024-09-13 08:29:14 +03:00
rv
scsi SCSI misc on 20240919 2024-09-19 11:28:51 +02:00
soc soc: driver updates for 6.12 2024-09-17 10:48:09 +02:00
sound ASoC: Updates for v6.12 2024-09-14 09:09:59 +02:00
target
trace firewire updates for v6.12 2024-09-23 12:55:27 -07:00
uapi iommufd 6.12 merge window pull 2024-09-24 11:55:26 -07:00
ufs Many singleton patches - please see the various changelogs for details. 2024-09-21 08:20:50 -07:00
vdso random: vDSO: add a __vdso_getrandom prototype for all architectures 2024-09-13 17:28:35 +02:00
video
xen