linux/drivers/pci
Alex Deucher 3f1271b54e PCI: Mark all AMD Navi10 and Navi14 GPU ATS as broken
There are enough VBIOS escapes without the proper workaround that some
users still hit this.  Microsoft never productized ATS on Windows so OEM
platforms that were Windows-only didn't always validate ATS.

The advantages of ATS are not worth it compared to the potential
instabilities on harvested boards.  Disable ATS on all Navi10 and Navi14
boards.

Symptoms include:

  amdgpu 0000:07:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0007 address=0xffffc02000 flags=0x0000]
  AMD-Vi: Event logged [IO_PAGE_FAULT device=07:00.0 domain=0x0007 address=0xffffc02000 flags=0x0000]
  [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=6047, emitted seq=6049
  amdgpu 0000:07:00.0: amdgpu: GPU reset begin!
  amdgpu 0000:07:00.0: amdgpu: GPU reset succeeded, trying to resume
  amdgpu 0000:07:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring sdma0 test failed (-110)
  [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <sdma_v4_0> failed -110
  amdgpu 0000:07:00.0: amdgpu: GPU reset(1) failed
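
To check whether a captured kernel log shows the same pattern, the symptom lines above can be matched with a few regular expressions. This is a minimal sketch only: the function name `find_ats_symptoms` and the pattern labels are hypothetical and not part of the kernel or any amdgpu tooling, and a match only indicates that the log resembles the symptoms listed here, not that ATS is the cause.

```python
import re

# Patterns derived from the symptom log quoted in the commit message.
# The labels and this helper are illustrative assumptions, not kernel tooling.
SYMPTOM_PATTERNS = {
    "io_page_fault": re.compile(r"AMD-Vi: Event logged \[IO_PAGE_FAULT\b"),
    "ring_timeout": re.compile(r"\*ERROR\* ring \S+ timeout"),
    "reset_failed": re.compile(r"amdgpu: GPU reset\(\d+\) failed"),
}


def find_ats_symptoms(log_text):
    """Return the sorted labels of all symptom patterns seen in log_text."""
    found = set()
    for line in log_text.splitlines():
        for label, pattern in SYMPTOM_PATTERNS.items():
            if pattern.search(line):
                found.add(label)
    return sorted(found)
```

Passing the saved output of `dmesg` to `find_ats_symptoms()` returns the labels that matched; an empty list means none of the quoted symptoms appeared.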

Related commits:

  e8946a53e2 ("PCI: Mark AMD Navi14 GPU ATS as broken")
  a2da5d8cc0 ("PCI: Mark AMD Raven iGPU ATS as broken in some platforms")
  45beb31d3a ("PCI: Mark AMD Navi10 GPU rev 0x00 ATS as broken")
  5e89cd303e ("PCI: Mark AMD Navi14 GPU rev 0xc5 ATS as broken")
  d28ca864c4 ("PCI: Mark AMD Stoney Radeon R7 GPU ATS as broken")
  9b44b0b09d ("PCI: Mark AMD Stoney GPU ATS as broken")

[bhelgaas: add symptoms and related commits]
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1760
Link: https://lore.kernel.org/r/20220222160801.841643-1-alexander.deucher@amd.com
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Guchun Chen <guchun.chen@amd.com>
2022-02-23 12:33:32 -06:00
Name                Last commit                                                           Date
..
controller          PCI: mvebu: Fix device enumeration regression                         2022-02-14 09:34:23 -06:00
endpoint            Merge branch 'pci/misc'                                               2022-01-13 09:57:52 -06:00
hotplug             Merge branch 'pci/errors'                                             2022-01-13 09:57:52 -06:00
msi                 PCI/MSI: Unbreak pci_irq_get_affinity()                               2021-12-18 20:33:21 +01:00
pcie                Merge branch 'pci/errors'                                             2022-01-13 09:57:52 -06:00
switch              PCI/switchtec: Declare local state_names[] as static                  2021-11-19 12:14:02 -06:00
access.c            PCI: Use PCI_ERROR_RESPONSE to identify config read errors            2021-11-18 14:31:43 -06:00
ats.c               PCI: Allow PASID on fake PCIe devices without TLP prefixes            2021-08-26 14:21:42 -05:00
bus.c
ecam.c              PCI: Dynamically map ECAM regions                                     2021-06-16 17:20:40 -05:00
host-bridge.c       PCI: VMD: ACPI: Make ACPI companion lookup work for VMD bus           2021-09-02 17:59:58 +02:00
iov.c               Revert "PCI: Use to_pci_driver() instead of pci_dev->driver"          2021-11-11 13:36:22 -06:00
irq.c
Kconfig             PCI: hv: Add arm64 Hyper-V vPCI support                               2022-01-12 08:24:29 -06:00
Makefile            PCI/MSI: Move code into a separate directory                          2021-12-09 11:52:22 +01:00
mmap.c
of.c                PCI: Correct misspelled words                                         2022-01-07 20:43:23 -06:00
p2pdma.c            pci-v5.17-changes                                                     2022-01-16 08:08:11 +02:00
pci-acpi.c          Merge branch 'pm-pci'                                                 2021-11-02 19:06:30 +01:00
pci-bridge-emul.c   Merge branch 'remotes/lorenzo/pci/bridge-emul'                        2022-01-13 09:57:51 -06:00
pci-bridge-emul.h   PCI: pci-bridge-emul: Add PCIe Root Capabilities Register             2021-08-05 10:51:49 +01:00
pci-driver.c        Revert "PCI: Use to_pci_driver() instead of pci_dev->driver"          2021-11-11 13:36:22 -06:00
pci-label.c         PCI/sysfs: Use sysfs_emit() and sysfs_emit_at() in "show" functions   2021-06-03 22:14:47 -05:00
pci-mid.c           PCI: PM: Do not use pci_platform_pm_ops for Intel MID PM              2021-09-27 17:13:21 +02:00
pci-pf-stub.c
pci-stub.c
pci-sysfs.c         PCI/sysfs: Use pci_irq_vector()                                       2021-12-09 11:52:21 +01:00
pci.c               pci-v5.17-changes                                                     2022-01-16 08:08:11 +02:00
pci.h               pci-v5.16-changes                                                     2021-11-06 14:36:12 -07:00
probe.c             pci-v5.17-changes                                                     2022-01-16 08:08:11 +02:00
proc.c              proc: remove PDE_DATA() completely                                    2022-01-22 08:33:37 +02:00
quirks.c            PCI: Mark all AMD Navi10 and Navi14 GPU ATS as broken                 2022-02-23 12:33:32 -06:00
remove.c            PCI: Remove reset_fn field from pci_dev                               2021-08-17 17:44:38 -05:00
rom.c               PCI: Prefer 'unsigned int' over bare 'unsigned'                       2021-10-27 13:41:22 -05:00
search.c
setup-bus.c         PCI: Prefer 'unsigned int' over bare 'unsigned'                       2021-10-27 13:41:22 -05:00
setup-irq.c         PCI: Tidy comments                                                    2021-09-28 13:43:17 -05:00
setup-res.c         PCI: Work around Intel I210 ROM BAR overlap defect                    2022-01-11 09:33:10 -06:00
slot.c              PCI/sysfs: Use default_groups in kobj_type for slot attrs             2021-12-29 13:42:04 -06:00
syscall.c           PCI: Return int from pciconfig_read() syscall                         2021-08-03 16:55:48 -05:00
vc.c
vpd.c               PCI/VPD: Use pci_read_vpd_any() in pci_vpd_size()                     2021-10-25 19:12:23 -05:00
xen-pcifront.c      xen/pcifront: Rework MSI handling                                     2021-12-16 22:22:18 +01:00