linux/drivers
Chris Wilson 3fc28d3e0e drm/i915/gt: Reset queue_priority_hint after wedging
An odd and highly unlikely path caught us out. On delayed submission
(due to an asynchronous reset handler), we poked the priority_hint and
kicked the tasklet. However, we had already marked the device as wedged
and swapped out the tasklet for a no-op. The result was that we never
cleared the priority hint and became upset when we later checked.

<0> [574.303565] i915_sel-6278    2.... 481822445us : __i915_subtests: Running intel_execlists_live_selftests/live_error_interrupt
<0> [574.303565] i915_sel-6278    2.... 481822472us : __engine_unpark: 0000:00:02.0 rcs0:
<0> [574.303565] i915_sel-6278    2.... 481822491us : __gt_unpark: 0000:00:02.0
<0> [574.303565] i915_sel-6278    2.... 481823220us : execlists_context_reset: 0000:00:02.0 rcs0: context:f4ee reset
<0> [574.303565] i915_sel-6278    2.... 481824830us : __intel_context_active: 0000:00:02.0 rcs0: context:f51b active
<0> [574.303565] i915_sel-6278    2.... 481825258us : __intel_context_do_pin: 0000:00:02.0 rcs0: context:f51b pin ring:{start:00006000, head:0000, tail:0000}
<0> [574.303565] i915_sel-6278    2.... 481825311us : __i915_request_commit: 0000:00:02.0 rcs0: fence f51b:2, current 0
<0> [574.303565] i915_sel-6278    2d..1 481825347us : __i915_request_submit: 0000:00:02.0 rcs0: fence f51b:2, current 0
<0> [574.303565] i915_sel-6278    2d..1 481825363us : trace_ports: 0000:00:02.0 rcs0: submit { f51b:2, 0:0 }
<0> [574.303565] i915_sel-6278    2.... 481826809us : __intel_context_active: 0000:00:02.0 rcs0: context:f51c active
<0> [574.303565]   <idle>-0       7d.h2 481827326us : cs_irq_handler: 0000:00:02.0 rcs0: CS error: 1
<0> [574.303565]   <idle>-0       7..s1 481827377us : process_csb: 0000:00:02.0 rcs0: cs-irq head=3, tail=4
<0> [574.303565]   <idle>-0       7..s1 481827379us : process_csb: 0000:00:02.0 rcs0: csb[4]: status=0x10000001:0x00000000
<0> [574.305593]   <idle>-0       7..s1 481827385us : trace_ports: 0000:00:02.0 rcs0: promote { f51b:2*, 0:0 }
<0> [574.305611]   <idle>-0       7..s1 481828179us : execlists_reset: 0000:00:02.0 rcs0: reset for CS error
<0> [574.305611] i915_sel-6278    2.... 481828284us : __intel_context_do_pin: 0000:00:02.0 rcs0: context:f51c pin ring:{start:00007000, head:0000, tail:0000}
<0> [574.305611] i915_sel-6278    2.... 481828345us : __i915_request_commit: 0000:00:02.0 rcs0: fence f51c:2, current 0
<0> [574.305611]   <idle>-0       7dNs2 481847823us : __i915_request_unsubmit: 0000:00:02.0 rcs0: fence f51b:2, current 1
<0> [574.305611]   <idle>-0       7dNs2 481847857us : execlists_hold: 0000:00:02.0 rcs0: fence f51b:2, current 1 on hold
<0> [574.305611]   <idle>-0       7.Ns1 481847863us : intel_engine_reset: 0000:00:02.0 rcs0: flags=4
<0> [574.305611]   <idle>-0       7.Ns1 481847945us : execlists_reset_prepare: 0000:00:02.0 rcs0: depth<-1
<0> [574.305611]   <idle>-0       7.Ns1 481847946us : intel_engine_stop_cs: 0000:00:02.0 rcs0:
<0> [574.305611]   <idle>-0       7.Ns1 538584284us : intel_engine_stop_cs: 0000:00:02.0 rcs0: timed out on STOP_RING -> IDLE
<0> [574.305611]   <idle>-0       7.Ns1 538584347us : __intel_gt_reset: 0000:00:02.0 engine_mask=1
<0> [574.305611]   <idle>-0       7.Ns1 538584406us : execlists_reset_rewind: 0000:00:02.0 rcs0:
<0> [574.305611]   <idle>-0       7dNs2 538585050us : __i915_request_reset: 0000:00:02.0 rcs0: fence f51b:2, current 1 guilty? yes
<0> [574.305611]   <idle>-0       7dNs2 538585063us : __execlists_reset: 0000:00:02.0 rcs0: replay {head:0000, tail:0068}
<0> [574.306565]   <idle>-0       7.Ns1 538588457us : intel_engine_cancel_stop_cs: 0000:00:02.0 rcs0:
<0> [574.306565]   <idle>-0       7dNs2 538588462us : __i915_request_submit: 0000:00:02.0 rcs0: fence f51c:2, current 0
<0> [574.306565]   <idle>-0       7dNs2 538588471us : trace_ports: 0000:00:02.0 rcs0: submit { f51c:2, 0:0 }
<0> [574.306565]   <idle>-0       7.Ns1 538588474us : execlists_reset_finish: 0000:00:02.0 rcs0: depth->1
<0> [574.306565] kworker/-202     2.... 538588755us : i915_request_retire: 0000:00:02.0 rcs0: fence f51c:2, current 2
<0> [574.306565] ksoftirq-46      7..s. 538588773us : process_csb: 0000:00:02.0 rcs0: cs-irq head=11, tail=1
<0> [574.306565] ksoftirq-46      7..s. 538588774us : process_csb: 0000:00:02.0 rcs0: csb[0]: status=0x10000001:0x00000000
<0> [574.306565] ksoftirq-46      7..s. 538588776us : trace_ports: 0000:00:02.0 rcs0: promote { f51c:2!, 0:0 }
<0> [574.306565] ksoftirq-46      7..s. 538588778us : process_csb: 0000:00:02.0 rcs0: csb[1]: status=0x10000018:0x00000020
<0> [574.306565] ksoftirq-46      7..s. 538588779us : trace_ports: 0000:00:02.0 rcs0: completed { f51c:2!, 0:0 }
<0> [574.306565] kworker/-202     2.... 538588826us : intel_context_unpin: 0000:00:02.0 rcs0: context:f51c unpin
<0> [574.306565] i915_sel-6278    6.... 538589663us : __intel_gt_set_wedged.part.32: 0000:00:02.0 start
<0> [574.306565] i915_sel-6278    6.... 538589667us : execlists_reset_prepare: 0000:00:02.0 rcs0: depth<-0
<0> [574.306565] i915_sel-6278    6.... 538589710us : intel_engine_stop_cs: 0000:00:02.0 rcs0:
<0> [574.306565] i915_sel-6278    6.... 538589732us : execlists_reset_prepare: 0000:00:02.0 bcs0: depth<-0
<0> [574.307591] i915_sel-6278    6.... 538589733us : intel_engine_stop_cs: 0000:00:02.0 bcs0:
<0> [574.307591] i915_sel-6278    6.... 538589757us : execlists_reset_prepare: 0000:00:02.0 vcs0: depth<-0
<0> [574.307591] i915_sel-6278    6.... 538589758us : intel_engine_stop_cs: 0000:00:02.0 vcs0:
<0> [574.307591] i915_sel-6278    6.... 538589771us : execlists_reset_prepare: 0000:00:02.0 vcs1: depth<-0
<0> [574.307591] i915_sel-6278    6.... 538589772us : intel_engine_stop_cs: 0000:00:02.0 vcs1:
<0> [574.307591] i915_sel-6278    6.... 538589778us : execlists_reset_prepare: 0000:00:02.0 vecs0: depth<-0
<0> [574.307591] i915_sel-6278    6.... 538589780us : intel_engine_stop_cs: 0000:00:02.0 vecs0:
<0> [574.307591] i915_sel-6278    6.... 538589786us : __intel_gt_reset: 0000:00:02.0 engine_mask=ff
<0> [574.307591] i915_sel-6278    6.... 538591175us : execlists_reset_cancel: 0000:00:02.0 rcs0:
<0> [574.307591] i915_sel-6278    6.... 538591970us : execlists_reset_cancel: 0000:00:02.0 bcs0:
<0> [574.307591] i915_sel-6278    6.... 538591982us : execlists_reset_cancel: 0000:00:02.0 vcs0:
<0> [574.307591] i915_sel-6278    6.... 538591996us : execlists_reset_cancel: 0000:00:02.0 vcs1:
<0> [574.307591] i915_sel-6278    6.... 538592759us : execlists_reset_cancel: 0000:00:02.0 vecs0:
<0> [574.307591] i915_sel-6278    6.... 538592977us : execlists_reset_finish: 0000:00:02.0 rcs0: depth->0
<0> [574.307591] i915_sel-6278    6.N.. 538592996us : execlists_reset_finish: 0000:00:02.0 bcs0: depth->0
<0> [574.307591] i915_sel-6278    6.N.. 538593023us : execlists_reset_finish: 0000:00:02.0 vcs0: depth->0
<0> [574.307591] i915_sel-6278    6.N.. 538593037us : execlists_reset_finish: 0000:00:02.0 vcs1: depth->0
<0> [574.307591] i915_sel-6278    6.N.. 538593051us : execlists_reset_finish: 0000:00:02.0 vecs0: depth->0
<0> [574.307591] i915_sel-6278    6.... 538593407us : __intel_gt_set_wedged.part.32: 0000:00:02.0 end
<0> [574.307591] kworker/-210     7d..1 551958381us : execlists_unhold: 0000:00:02.0 rcs0: fence f51b:2, current 2 hold release
<0> [574.307591] i915_sel-6278    0.... 559490788us : i915_request_retire: 0000:00:02.0 rcs0: fence f51b:2, current 2
<0> [574.307591] i915_sel-6278    0.... 559490793us : intel_context_unpin: 0000:00:02.0 rcs0: context:f51b unpin
<0> [574.307591] i915_sel-6278    0.... 559490798us : __engine_park: 0000:00:02.0 rcs0: parked
<0> [574.307591] i915_sel-6278    0.... 559490982us : __intel_context_retire: 0000:00:02.0 rcs0: context:f51c retire runtime: { total:30004ns, avg:30004ns }
<0> [574.307591] i915_sel-6278    0.... 559491372us : __engine_park: __engine_park:261 GEM_BUG_ON(engine->execlists.queue_priority_hint != (-((int)(~0U >> 1)) - 1))

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200227085723.1961649-9-chris@chris-wilson.co.uk
2020-02-28 15:48:10 +00:00
..
accessibility
acpi ACPI: PM: s2idle: Prevent spurious SCIs from waking up the system 2020-02-11 23:26:15 +01:00
amba
android for-5.6/io_uring-vfs-2020-01-29 2020-01-29 18:53:37 -08:00
ata libata-5.6-2020-02-05 2020-02-06 06:11:50 +00:00
atm Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2020-01-28 16:02:33 -08:00
auxdisplay
base ARM: SoC-related driver updates 2020-02-08 14:04:19 -08:00
bcma Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2020-01-28 16:02:33 -08:00
block Merge branch 'merge.nfs-fs_parse.1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-02-08 13:26:41 -08:00
bluetooth Bluetooth: btrtl: Use kvmalloc for FW allocations 2020-01-24 19:57:53 +01:00
bus bus: moxtet: fix potential stack buffer overflow 2020-02-15 10:33:19 -08:00
cdrom
char Minor bug fixes for IPMI 2020-02-16 13:05:46 -08:00
clk ARM: SoC: late updates 2020-02-08 14:17:27 -08:00
clocksource ARM: SoC: late updates 2020-02-08 14:17:27 -08:00
connector
counter
cpufreq Merge branch 'pm-cpufreq' 2020-02-14 10:40:48 +01:00
cpuidle ARM: SoC-related driver updates 2020-02-08 14:04:19 -08:00
crypto Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2020-01-28 16:02:33 -08:00
dax dax: Get rid of fs_dax_get_by_host() helper 2020-01-16 09:52:27 -08:00
dca
devfreq PM / devfreq: Add debugfs support with devfreq_summary file 2020-01-16 19:14:49 +09:00
dio
dma ARM: Device-tree updates 2020-02-08 13:58:44 -08:00
dma-buf
edac EDAC/sysfs: Remove csrow objects on errors 2020-02-13 13:29:41 +01:00
eisa
extcon
firewire
firmware ARM: SoC-related driver updates 2020-02-08 14:04:19 -08:00
fpga fpga: xilinx-pr-decoupler: Remove clk_get error message for probe defer 2020-01-10 12:51:56 -08:00
fsi
gnss
gpio gpio: sifive: fix static checker warning 2020-02-10 13:54:17 +01:00
gpu drm/i915/gt: Reset queue_priority_hint after wedging 2020-02-28 15:48:10 +00:00
greybus
hid drm pull for 5.6-rc1 2020-01-30 08:04:01 -08:00
hsi
hv - Most of the commits here are work to enable host-initiated hibernation 2020-02-03 14:42:03 +00:00
hwmon hwmon: (pmbus/xdpe12284) fix typo in compatible strings 2020-02-12 12:43:01 -08:00
hwspinlock hwspinlock: sirf: Use devm_hwspin_lock_register() to register hwlock controller 2020-01-21 16:16:36 -08:00
hwtracing coresight: etm4x: Fix unused function warning 2020-01-14 15:38:28 +01:00
i2c Merge branch 'i2c/for-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux 2020-02-07 12:54:13 -08:00
i3c i3c: master: dw: reattach device on first available location of address table 2020-01-13 10:00:05 +01:00
ide proc: convert everything to "struct proc_ops" 2020-02-04 03:05:26 +00:00
idle intel_idle: Introduce 'states_off' module parameter 2020-02-03 11:57:18 +01:00
iio chrome platform changes for 5.6 2020-02-04 07:17:41 +00:00
infiniband IB/mlx5: Use div64_u64 for num_var_hw_entries calculation 2020-02-14 15:21:52 -04:00
input Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input 2020-02-15 16:49:25 -08:00
interconnect
iommu IOMMU Updates for Linux v5.6 2020-02-05 17:49:54 +00:00
ipack
irqchip irqchip/gic-v4.1: Avoid 64bit division for the sake of 32bit ARM 2020-02-09 15:47:37 -08:00
isdn proc: convert everything to "struct proc_ops" 2020-02-04 03:05:26 +00:00
leds leds: lm3532: add pointer to documentation and fix typo 2020-01-22 21:08:24 +01:00
lightnvm
macintosh powerpc updates for 5.6 2020-02-04 13:06:46 +00:00
mailbox
mcb
md block-5.6-2020-02-16 2020-02-16 12:35:52 -08:00
media chrome platform changes for 5.6 2020-02-04 07:17:41 +00:00
memory mvebu drivers for 5.6 (part 1) 2020-01-16 10:45:44 -08:00
memstick
message Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/net 2020-01-19 22:10:04 +01:00
mfd chrome platform changes for 5.6 2020-02-04 07:17:41 +00:00
misc Merge branch 'i2c/for-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux 2020-02-07 12:54:13 -08:00
mmc ioremap changes for 5.6 2020-01-27 13:03:00 -08:00
mtd treewide: remove redundant IS_ERR() before error code check 2020-02-04 03:05:27 +00:00
mux
net net: hns3: fix a copying IPv6 address error in hclge_fd_get_flow_tuples() 2020-02-14 07:05:17 -08:00
nfc Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/net 2020-01-19 22:10:04 +01:00
ntb
nubus
nvdimm mm: Cleanup __put_devmap_managed_page() vs ->page_free() 2020-01-31 10:30:37 -08:00
nvme block-5.6-2020-02-16 2020-02-16 12:35:52 -08:00
nvmem Merge branch 'i2c/for-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux 2020-02-07 12:54:13 -08:00
of ARM: SoC-related driver updates 2020-02-08 14:04:19 -08:00
opp ioremap changes for 5.6 2020-01-27 13:03:00 -08:00
oprofile tracing: Make struct ring_buffer less ambiguous 2020-01-13 13:19:38 -05:00
parisc proc: convert everything to "struct proc_ops" 2020-02-04 03:05:26 +00:00
parport
pci pci-v5.6-fixes-1 2020-02-06 14:17:38 +00:00
pcmcia
perf perf/smmuv3: Use platform_get_irq_optional() for wired interrupt 2020-02-10 18:14:46 +00:00
phy treewide: remove redundant IS_ERR() before error code check 2020-02-04 03:05:27 +00:00
pinctrl pinctrl: fix pxa2xx.c build warnings 2020-02-04 03:05:24 +00:00
platform Merge branch 'akpm' (patches from Andrew) 2020-02-04 07:24:48 +00:00
pnp proc: convert everything to "struct proc_ops" 2020-02-04 03:05:26 +00:00
power ARM: SoC platform updates 2020-02-08 13:55:25 -08:00
powercap Merge back power capping changes for v5.6. 2020-01-13 10:32:19 +01:00
pps
ps3
ptp Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/net 2020-01-19 22:10:04 +01:00
pwm pwm: Remove set but not set variable 'pwm' 2020-01-20 15:40:49 +01:00
rapidio
ras
regulator - New Drivers 2020-02-03 14:51:57 +00:00
remoteproc remoteproc: qcom: q6v5-mss: Improve readability of reset_assert 2020-01-24 09:34:07 -08:00
reset Reset controller updates for v5.6 2020-01-10 22:20:37 -08:00
rpmsg rpmsg: add rpmsg support for mt8183 SCP. 2020-01-20 10:29:56 -08:00
rtc chrome platform changes for 5.6 2020-02-04 07:17:41 +00:00
s390 fix style of SPDX License Identifier 2020-02-11 20:14:28 +01:00
sbus
scsi SCSI misc on 20200208 2020-02-08 17:24:41 -08:00
sfi
sh
siox siox: Use the correct style for SPDX License Identifier 2020-01-14 21:46:53 +01:00
slimbus slimbus: qcom: add missed clk_disable_unprepare in remove 2020-01-14 21:46:48 +01:00
soc soc/tegra: fuse: Fix build with Tegra194 configuration 2020-02-11 15:00:15 -08:00
soundwire soundwire: cadence: fix kernel-doc parameter descriptions 2020-01-16 17:34:38 +05:30
spi treewide: remove redundant IS_ERR() before error code check 2020-02-04 03:05:27 +00:00
spmi spmi: pmic-arb: Set lockdep class for hierarchical irq domains 2020-02-10 13:16:04 +01:00
ssb
staging proc: convert everything to "struct proc_ops" 2020-02-04 03:05:26 +00:00
target SCSI misc on 20200129 2020-01-29 18:16:16 -08:00
tc The main MIPS changes for 5.6: 2020-01-31 11:28:31 -08:00
tee ARM: SoC-related driver updates 2020-02-08 14:04:19 -08:00
thermal - Fix a SEVERE docs build failure for cpu idle cooling device (Randy Dunlap) 2020-01-31 14:39:21 -08:00
thunderbolt thunderbolt: fix memory leak of object sw 2020-01-14 15:37:41 +01:00
tty Kbuild updates for v5.6 (2nd) 2020-02-09 16:05:50 -08:00
uio uio: uio_pdrv_genirq: Do not log an error when deferring probe routine. 2020-01-14 15:27:51 +01:00
usb Merge branch 'merge.nfs-fs_parse.1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-02-08 13:26:41 -08:00
vfio VFIO updates for v5.6-rc1 2020-02-03 22:22:05 +00:00
vhost
video drm-misc-next for 5.7: 2020-02-21 05:44:40 +10:00
virt
virtio virtio_balloon: Fix memory leaks on errors in virtballoon_probe() 2020-02-06 03:40:27 -05:00
visorbus visorbus: fix uninitialized variable access 2020-01-14 15:30:35 +01:00
vlynq
vme Char/Misc driver changes for 5.6-rc1 2020-01-29 10:35:54 -08:00
w1 Char/Misc driver changes for 5.6-rc1 2020-01-29 10:35:54 -08:00
watchdog linux-watchdog 5.6-rc1 tag 2020-02-07 12:30:16 -08:00
xen xen: branch for v5.6-rc1 2020-02-05 17:44:14 +00:00
zorro Kbuild updates for v5.6 (2nd) 2020-02-09 16:05:50 -08:00
Kconfig
Makefile