linux/drivers
Chris Wilson 4a3859ea52
drm/i915/gt: Reset queue_priority_hint on parking
Originally, with strict in order execution, we could complete execution
only when the queue was empty. Preempt-to-busy allows replacement of an
active request that may complete before the preemption is processed by
HW. If that happens, the request is retired from the queue, but the
queue_priority_hint remains set, preventing direct submission until
after the next CS interrupt is processed.

This preempt-to-busy race can be triggered by the heartbeat, which will
also act as the power-management barrier and upon completion allow us to
idle the HW. We may process the completion of the heartbeat, and begin
parking the engine before the CS event that restores the
queue_priority_hint, causing us to fail the assertion that it is MIN.

<3>[  166.210729] __engine_park:283 GEM_BUG_ON(engine->sched_engine->queue_priority_hint != (-((int)(~0U >> 1)) - 1))
<0>[  166.210781] Dumping ftrace buffer:
<0>[  166.210795] ---------------------------------
...
<0>[  167.302811] drm_fdin-1097      2..s1. 165741070us : trace_ports: 0000:00:02.0 rcs0: promote { ccid:20 1217:2 prio 0 }
<0>[  167.302861] drm_fdin-1097      2d.s2. 165741072us : execlists_submission_tasklet: 0000:00:02.0 rcs0: preempting last=1217:2, prio=0, hint=2147483646
<0>[  167.302928] drm_fdin-1097      2d.s2. 165741072us : __i915_request_unsubmit: 0000:00:02.0 rcs0: fence 1217:2, current 0
<0>[  167.302992] drm_fdin-1097      2d.s2. 165741073us : __i915_request_submit: 0000:00:02.0 rcs0: fence 3:4660, current 4659
<0>[  167.303044] drm_fdin-1097      2d.s1. 165741076us : execlists_submission_tasklet: 0000:00:02.0 rcs0: context:3 schedule-in, ccid:40
<0>[  167.303095] drm_fdin-1097      2d.s1. 165741077us : trace_ports: 0000:00:02.0 rcs0: submit { ccid:40 3:4660* prio 2147483646 }
<0>[  167.303159] kworker/-89       11..... 165741139us : i915_request_retire.part.0: 0000:00:02.0 rcs0: fence c90:2, current 2
<0>[  167.303208] kworker/-89       11..... 165741148us : __intel_context_do_unpin: 0000:00:02.0 rcs0: context:c90 unpin
<0>[  167.303272] kworker/-89       11..... 165741159us : i915_request_retire.part.0: 0000:00:02.0 rcs0: fence 1217:2, current 2
<0>[  167.303321] kworker/-89       11..... 165741166us : __intel_context_do_unpin: 0000:00:02.0 rcs0: context:1217 unpin
<0>[  167.303384] kworker/-89       11..... 165741170us : i915_request_retire.part.0: 0000:00:02.0 rcs0: fence 3:4660, current 4660
<0>[  167.303434] kworker/-89       11d..1. 165741172us : __intel_context_retire: 0000:00:02.0 rcs0: context:1216 retire runtime: { total:56028ns, avg:56028ns }
<0>[  167.303484] kworker/-89       11..... 165741198us : __engine_park: 0000:00:02.0 rcs0: parked
<0>[  167.303534]   <idle>-0         5d.H3. 165741207us : execlists_irq_handler: 0000:00:02.0 rcs0: semaphore yield: 00000040
<0>[  167.303583] kworker/-89       11..... 165741397us : __intel_context_retire: 0000:00:02.0 rcs0: context:1217 retire runtime: { total:325575ns, avg:0ns }
<0>[  167.303756] kworker/-89       11..... 165741777us : __intel_context_retire: 0000:00:02.0 rcs0: context:c90 retire runtime: { total:0ns, avg:0ns }
<0>[  167.303806] kworker/-89       11..... 165742017us : __engine_park: __engine_park:283 GEM_BUG_ON(engine->sched_engine->queue_priority_hint != (-((int)(~0U >> 1)) - 1))
<0>[  167.303811] ---------------------------------
<4>[  167.304722] ------------[ cut here ]------------
<2>[  167.304725] kernel BUG at drivers/gpu/drm/i915/gt/intel_engine_pm.c:283!
<4>[  167.304731] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
<4>[  167.304734] CPU: 11 PID: 89 Comm: kworker/11:1 Tainted: G        W          6.8.0-rc2-CI_DRM_14193-gc655e0fd2804+ #1
<4>[  167.304736] Hardware name: Intel Corporation Rocket Lake Client Platform/RocketLake S UDIMM 6L RVP, BIOS RKLSFWI1.R00.3173.A03.2204210138 04/21/2022
<4>[  167.304738] Workqueue: i915-unordered retire_work_handler [i915]
<4>[  167.304839] RIP: 0010:__engine_park+0x3fd/0x680 [i915]
<4>[  167.304937] Code: 00 48 c7 c2 b0 e5 86 a0 48 8d 3d 00 00 00 00 e8 79 48 d4 e0 bf 01 00 00 00 e8 ef 0a d4 e0 31 f6 bf 09 00 00 00 e8 03 49 c0 e0 <0f> 0b 0f 0b be 01 00 00 00 e8 f5 61 fd ff 31 c0 e9 34 fd ff ff 48
<4>[  167.304940] RSP: 0018:ffffc9000059fce0 EFLAGS: 00010246
<4>[  167.304942] RAX: 0000000000000200 RBX: 0000000000000000 RCX: 0000000000000006
<4>[  167.304944] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000009
<4>[  167.304946] RBP: ffff8881330ca1b0 R08: 0000000000000001 R09: 0000000000000001
<4>[  167.304947] R10: 0000000000000001 R11: 0000000000000001 R12: ffff8881330ca000
<4>[  167.304948] R13: ffff888110f02aa0 R14: ffff88812d1d0205 R15: ffff88811277d4f0
<4>[  167.304950] FS:  0000000000000000(0000) GS:ffff88844f780000(0000) knlGS:0000000000000000
<4>[  167.304952] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[  167.304953] CR2: 00007fc362200c40 CR3: 000000013306e003 CR4: 0000000000770ef0
<4>[  167.304955] PKRU: 55555554
<4>[  167.304957] Call Trace:
<4>[  167.304958]  <TASK>
<4>[  167.305573]  ____intel_wakeref_put_last+0x1d/0x80 [i915]
<4>[  167.305685]  i915_request_retire.part.0+0x34f/0x600 [i915]
<4>[  167.305800]  retire_requests+0x51/0x80 [i915]
<4>[  167.305892]  intel_gt_retire_requests_timeout+0x27f/0x700 [i915]
<4>[  167.305985]  process_scheduled_works+0x2db/0x530
<4>[  167.305990]  worker_thread+0x18c/0x350
<4>[  167.305993]  kthread+0xfe/0x130
<4>[  167.305997]  ret_from_fork+0x2c/0x50
<4>[  167.306001]  ret_from_fork_asm+0x1b/0x30
<4>[  167.306004]  </TASK>

It is necessary for the queue_priority_hint to be lower than the next
request submission upon waking up, as we rely on the hint to decide when
to kick the tasklet to submit that first request.

Fixes: 22b7a426bb ("drm/i915/execlists: Preempt-to-busy")
Closes: https://gitlab.freedesktop.org/drm/intel/issues/10154
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240318135906.716055-2-janusz.krzysztofik@linux.intel.com
(cherry picked from commit 98850e96cf)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-03-28 12:16:16 -04:00
..
accel drm for 6.9: 2024-03-13 18:34:05 -07:00
accessibility Char/Misc and other driver subsystem updates for 6.9-rc1 2024-03-21 13:21:31 -07:00
acpi RISC-V Patches for the 6.9 Merge Window 2024-03-22 10:41:13 -07:00
amba
android Char/Misc and other driver subsystem updates for 6.9-rc1 2024-03-21 13:21:31 -07:00
ata ahci: asm1064: asm1166: don't limit reported ports 2024-03-19 12:06:54 +01:00
atm atm: fore200e: Convert to platform remove callback returning void 2024-03-07 20:36:32 -08:00
auxdisplay auxdisplay: img-ascii-lcd: Convert to platform remove callback returning void 2024-03-12 17:37:54 +02:00
base Driver core changes for 6.9-rc1 2024-03-21 13:34:15 -07:00
bcma
block block-6.9-20240322 2024-03-22 12:46:07 -07:00
bluetooth TTY/Serial driver update for 6.9-rc1 2024-03-21 12:44:10 -07:00
bus Char/Misc and other driver subsystem updates for 6.9-rc1 2024-03-21 13:21:31 -07:00
cache
cdrom cdrom: gdrom: Convert to platform remove callback returning void 2024-03-07 11:53:30 -07:00
cdx cdx: add MSI support for CDX bus 2024-03-07 21:52:03 +00:00
char Char/Misc and other driver subsystem updates for 6.9-rc1 2024-03-21 13:21:31 -07:00
clk ARM: late SoC changes for 6.9 2024-03-19 11:57:26 -07:00
clocksource A set of updates for clocksource and clockevent drivers: 2024-03-23 14:42:45 -07:00
comedi Char/Misc and other driver subsystem updates for 6.9-rc1 2024-03-21 13:21:31 -07:00
connector
counter
cpufreq RISC-V Patches for the 6.9 Merge Window 2024-03-22 10:41:13 -07:00
cpuidle RISC-V Patches for the 6.9 Merge Window 2024-03-22 10:41:13 -07:00
crypto This update includes the following changes: 2024-03-15 14:46:54 -07:00
cxl Tracing updates for 6.9: 2024-03-18 15:11:44 -07:00
dax libnvdimm updates for v6.9 2024-03-15 11:58:32 -07:00
dca
devfreq
dio dio: make dio_bus_type const 2024-03-07 20:37:04 +00:00
dma dmaengine updates for v6.9 2024-03-15 12:25:13 -07:00
dma-buf
dpll Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-03-11 20:38:36 -07:00
edac - Add a FRU (Field Replaceable Unit) memory poison manager which 2024-03-11 18:14:06 -07:00
eisa
extcon
firewire firewire: core: add memo about the caller of show functions for device attributes 2024-03-21 21:20:18 +09:00
firmware EFI fixes for v6.9 #2 2024-03-24 13:54:06 -07:00
fpga Char/Misc and other driver subsystem updates for 6.9-rc1 2024-03-21 13:21:31 -07:00
fsi
gnss
gpio Pin control changes for the v6.9 kernel cycle: 2024-03-14 10:22:26 -07:00
gpu drm/i915/gt: Reset queue_priority_hint on parking 2024-03-28 12:16:16 -04:00
greybus Char/Misc and other driver subsystem updates for 6.9-rc1 2024-03-21 13:21:31 -07:00
hid hid-for-linus-2024031301 2024-03-14 09:56:15 -07:00
hsi
hte
hv hyperv-next for v6.9 2024-03-21 10:01:02 -07:00
hwmon - Kuan-Wei Chiu has developed the well-named series "lib min_heap: Min 2024-03-14 18:03:09 -07:00
hwspinlock
hwtracing
i2c i2c: muxes: pca954x: Allow sharing reset GPIO 2024-03-20 09:45:04 +01:00
i3c
idle
iio Char/Misc and other driver subsystem updates for 6.9-rc1 2024-03-21 13:21:31 -07:00
infiniband RDMA v6.9 2024-03-18 15:34:03 -07:00
input TTY/Serial driver update for 6.9-rc1 2024-03-21 12:44:10 -07:00
interconnect interconnect changes for 6.9 2024-03-06 14:03:31 +00:00
iommu dma-mapping fixes for Linux 6.9 2024-03-24 10:45:31 -07:00
ipack ipack: make ipack_bus_type const 2024-03-07 20:32:47 +00:00
irqchip irqchip/renesas-rzg2l: Do not set TIEN and TINT source at the same time 2024-03-18 21:09:02 +01:00
isdn isdn: capi: make capi_class constant 2024-03-07 20:26:24 -08:00
leds - Kuan-Wei Chiu has developed the well-named series "lib min_heap: Min 2024-03-14 18:03:09 -07:00
macintosh powerpc updates for 6.9 2024-03-15 17:53:48 -07:00
mailbox imx: add support for i.MX95 ELE/V2X MU 2024-03-13 12:23:36 -07:00
mcb mcb: constify the struct device_type usage 2024-03-07 20:38:15 +00:00
md - Fix a memory leak in DM integrity recheck code that was added during 2024-03-22 12:34:26 -07:00
media Linux 6.8 2024-03-18 17:30:46 +00:00
memory Char/Misc and other driver subsystem updates for 6.9-rc1 2024-03-21 13:21:31 -07:00
memstick MMC core: 2024-03-13 10:59:28 -07:00
message
mfd TTY/Serial driver update for 6.9-rc1 2024-03-21 12:44:10 -07:00
misc hardening fixes for v6.9-rc1 2024-03-23 08:43:21 -07:00
mmc Linux 6.8 2024-03-18 17:30:46 +00:00
most most: core: make mostbus const 2024-03-07 20:32:38 +00:00
mtd This pull request contains updates for UBI and UBIFS: 2024-03-21 15:09:29 -07:00
mux
net hardening fixes for v6.9-rc1 2024-03-23 08:43:21 -07:00
nfc
ntb
nubus
nvdimm libnvdimm updates for v6.9 2024-03-15 11:58:32 -07:00
nvme nvme updates for Linux 6.9 2024-03-21 13:23:07 -06:00
nvmem nvmem: core: Print error on wrong bits DT property 2024-03-07 20:21:53 +00:00
of Driver core changes for 6.9-rc1 2024-03-21 13:34:15 -07:00
opp OPP: Extend dev_pm_opp_data with turbo support 2024-03-11 10:39:24 +05:30
parisc parisc: led: Convert to platform remove callback returning void 2024-03-08 10:00:07 +01:00
parport parport: sunbpp: Convert to platform remove callback returning void 2024-03-07 21:50:06 +00:00
pci pci-v6.9-changes 2024-03-14 10:58:27 -07:00
pcmcia pcmcia: cs: make pcmcia_socket_class constant 2024-03-10 09:07:00 +01:00
peci
perf RISC-V Patches for the 6.9 Merge Window 2024-03-22 10:41:13 -07:00
phy USB/Thunderbolt changes for 6.9-rc1 2024-03-21 12:35:20 -07:00
pinctrl phy-for-6.9 2024-03-16 11:24:51 -07:00
platform Char/Misc and other driver subsystem updates for 6.9-rc1 2024-03-21 13:21:31 -07:00
pmdomain Core: 2024-03-13 11:33:10 -07:00
pnp
power power supply and reset changes for the 6.9 series 2024-03-14 10:19:48 -07:00
powercap powercap: intel_rapl: Convert to platform remove callback returning void 2024-03-13 20:45:54 +01:00
pps pps: use cflags-y instead of EXTRA_CFLAGS 2024-03-07 21:51:39 +00:00
ps3
ptp Networking changes for 6.9. 2024-03-12 17:44:08 -07:00
pwm
rapidio
ras - Add a FRU (Field Replaceable Unit) memory poison manager which 2024-03-11 18:14:06 -07:00
regulator regulator: Fix for v6.9 2024-03-22 09:52:37 -07:00
remoteproc remoteproc updates for v6.9 2024-03-21 10:37:39 -07:00
reset
rpmsg
rtc RTC for 6.9 2024-03-21 17:16:46 -07:00
s390 more s390 updates for 6.9 merge window 2024-03-19 11:38:27 -07:00
sbus This includes the following changes related to sparc for v6.9: 2024-03-15 12:47:21 -07:00
scsi SCSI misc on 20240322 2024-03-22 13:31:07 -07:00
sh
siox SIOX changes for 6.9-rc1 2024-03-21 15:18:18 -07:00
slimbus slimbus: core: make slimbus_bus const 2024-03-07 20:21:39 +00:00
soc Including fixes from CAN, netfilter, wireguard and IPsec. 2024-03-21 14:50:39 -07:00
soundwire soundwire updates for 6.9 2024-03-15 12:22:52 -07:00
spi spi: Fixes for v6.9 2024-03-22 09:57:00 -07:00
spmi
ssb
staging Staging driver cleanups for 6.9-rc1 2024-03-21 13:03:44 -07:00
target SCSI misc on 20240316 2024-03-16 16:31:12 -07:00
tc
tee ARM: SoC drivers for 6.9 2024-03-12 10:35:24 -07:00
thermal - Fix memory leak in the error path at probe time in the Mediatek LVTS 2024-03-13 20:35:48 +01:00
thunderbolt USB/Thunderbolt changes for 6.9-rc1 2024-03-21 12:35:20 -07:00
tty TTY/Serial driver update for 6.9-rc1 2024-03-21 12:44:10 -07:00
ufs SCSI misc on 20240316 2024-03-16 16:31:12 -07:00
uio uio_dmem_genirq: UIO_MEM_DMA_COHERENT conversion 2024-03-07 21:52:59 +00:00
usb USB/Thunderbolt changes for 6.9-rc1 2024-03-21 12:35:20 -07:00
vdpa vDPA: report virtio-blk flush info to user space 2024-03-19 02:45:51 -04:00
vfio VFIO updates for v6.9-rc1 2024-03-15 13:21:13 -07:00
vhost virtio: features, fixes 2024-03-19 08:57:39 -07:00
video fbdev fixes and cleanups for 6.9-rc1: 2024-03-22 10:09:08 -07:00
virt virt: efi_secret: Convert to platform remove callback returning void 2024-03-09 11:37:18 +01:00
virtio virtio: packed: fix unmap leak for indirect desc table 2024-03-19 03:19:22 -04:00
w1
watchdog linux-watchdog 6.9-rc1 tag 2024-03-17 12:06:10 -07:00
xen xen: branch for v6.9-rc1 2024-03-19 08:48:09 -07:00
zorro
Kconfig
Makefile Revert "leds: Only descend into leds directory when CONFIG_NEW_LEDS is set" 2024-03-07 08:48:10 +00:00