linux/drivers/gpu/drm/amd/amdkfd
Alex Sierra ec6abe831a drm/amdkfd: rm BO resv on validation to avoid deadlock
This fixes a deadlock on BO reservations that occurs when SVM_BO
evictions run while allocations in VRAM are performed concurrently.
More specifically, while TTM waits for the fence to be signaled
(ttm_bo_wait), it already holds the BO reservation. In parallel, the
restore worker might be running, prefetching memory to VRAM. This also
requires reserving the BO, but the restore worker takes the mmap
semaphore first. The deadlock happens when the SVM_BO eviction worker
kicks in and waits for the mmap semaphore held by the restore worker.
This prevents the fence from being signaled, so the deadlock persists
until TTM times out.

We no longer need to hold the BO reservation during validation and
mapping, because the physical addresses are now taken from
hmm_range_fault. We also take migrate_mutex to prevent range migration
while validate_and_map updates the GPU page table.
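
The locking change described above can be illustrated with a small
user-space sketch. This is not the kernel code: struct range_sketch,
validate_and_map_sketch and try_migrate_sketch are hypothetical names,
and pthread mutexes only stand in for the BO reservation and
migrate_mutex. The point it tries to show is that mapping takes
migrate_mutex alone, so it can proceed while another party still holds
the BO reservation.

```c
#include <pthread.h>
#include <stddef.h>

#define NPAGES_SKETCH 4

/* Hypothetical stand-in for an SVM range; not a real kernel type. */
struct range_sketch {
    pthread_mutex_t migrate_mutex;        /* models migrate_mutex */
    pthread_mutex_t bo_resv;              /* models the BO reservation */
    unsigned long pages[NPAGES_SKETCH];   /* models addresses from hmm_range_fault */
    unsigned long gpu_pt[NPAGES_SKETCH];  /* models GPU page-table entries */
    int location;                         /* 0 = system memory, 1 = VRAM */
};

void range_sketch_init(struct range_sketch *r)
{
    size_t i;

    pthread_mutex_init(&r->migrate_mutex, NULL);
    pthread_mutex_init(&r->bo_resv, NULL);
    for (i = 0; i < NPAGES_SKETCH; i++) {
        r->pages[i] = 0x1000UL * (i + 1);
        r->gpu_pt[i] = 0;
    }
    r->location = 0;
}

/*
 * Models validation and mapping after the change: only migrate_mutex
 * is taken, and bo_resv is never touched, so a holder of bo_resv
 * cannot block this path.
 */
int validate_and_map_sketch(struct range_sketch *r)
{
    size_t i;

    pthread_mutex_lock(&r->migrate_mutex);
    for (i = 0; i < NPAGES_SKETCH; i++)
        r->gpu_pt[i] = r->pages[i] | 1UL; /* low bit models "valid" */
    pthread_mutex_unlock(&r->migrate_mutex);
    return 0;
}

/*
 * Models a migration attempt: returns -1 if the range is currently
 * being mapped (migrate_mutex held), 0 once the location is updated.
 */
int try_migrate_sketch(struct range_sketch *r, int new_location)
{
    if (pthread_mutex_trylock(&r->migrate_mutex) != 0)
        return -1;
    r->location = new_location;
    pthread_mutex_unlock(&r->migrate_mutex);
    return 0;
}
```

In this model, a caller that locks bo_resv (as TTM does while waiting
for the fence) can still call validate_and_map_sketch without blocking,
while try_migrate_sketch reports busy for as long as migrate_mutex is
held.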

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <philip.yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-10-08 13:20:31 -04:00
cik_event_interrupt.c drm/amdkfd: Add kernel parameter to stop queue eviction on vm fault 2021-03-23 22:59:22 -04:00
cik_int.h
cik_regs.h
cwsr_trap_handler_gfx8.asm drm/amdkfd: Add aldebaran trap handler support 2021-03-10 00:02:24 -05:00
cwsr_trap_handler_gfx9.asm drm/amdkfd: Fix saving the ACC vgprs for Aldebaran 2021-03-23 22:56:55 -04:00
cwsr_trap_handler_gfx10.asm
cwsr_trap_handler.h drm/amdkfd: Fix saving the ACC vgprs for Aldebaran 2021-03-23 22:56:55 -04:00
Kconfig drm/amdkfd: Add CONFIG_HSA_AMD_SVM 2021-04-20 21:50:35 -04:00
kfd_chardev.c drm/amdkfd: Only apply heavy-weight TLB flush on Aldebaran 2021-08-02 17:21:25 -04:00
kfd_crat.c drm/amdkfd: enable cyan_skillfish KFD 2021-07-23 10:08:01 -04:00
kfd_crat.h
kfd_dbgdev.c drm/amdkfd: dqm fence memory corruption 2021-04-09 16:47:06 -04:00
kfd_dbgdev.h
kfd_dbgmgr.c
kfd_dbgmgr.h
kfd_debugfs.c drm/amdkfd: Fix cat debugfs hang_hws file causes system crash bug 2021-04-09 16:42:11 -04:00
kfd_device_queue_manager_cik.c
kfd_device_queue_manager_v9.c drm/amdkfd: add xnack enabled flag to kfd_process 2021-04-20 21:47:41 -04:00
kfd_device_queue_manager_v10.c
kfd_device_queue_manager_vi.c
kfd_device_queue_manager.c drm/amdkfd: CWSR with software scheduler 2021-08-11 17:19:54 -04:00
kfd_device_queue_manager.h drm/amdkfd: Renaming dqm->packets to dqm->packet_mgr 2021-07-23 10:07:59 -04:00
kfd_device.c drm/amdkfd: remove redundant iommu cleanup code 2021-10-05 12:22:15 -04:00
kfd_doorbell.c
kfd_events.c drm/amdkfd: fix a resource leakage issue 2021-05-19 22:44:12 -04:00
kfd_events.h
kfd_flat_memory.c drm/amdkfd: enable cyan_skillfish KFD 2021-07-23 10:08:01 -04:00
kfd_int_process_v9.c amd/amdkfd: add ras page retirement handling for sq/sdma (v3) 2021-10-04 15:22:57 -04:00
kfd_interrupt.c
kfd_iommu.c IOMMU Updates for Linux v5.13 2021-05-01 09:33:00 -07:00
kfd_iommu.h drm/amdkfd: fix build error with AMD_IOMMU_V2=m 2021-03-23 23:28:11 -04:00
kfd_kernel_queue.c
kfd_kernel_queue.h
kfd_migrate.c drm/amdkfd: fix resource_size.cocci warnings 2021-09-29 17:30:00 -04:00
kfd_migrate.h drm/amdkfd: fix svm_migrate_fini warning 2021-09-23 16:34:57 -04:00
kfd_module.c
kfd_mqd_manager_cik.c drm/amdkfd: Check HIQ's MQD for queue preemption status 2021-03-23 22:59:25 -04:00
kfd_mqd_manager_v9.c drm/amdkfd: Check HIQ's MQD for queue preemption status 2021-03-23 22:59:25 -04:00
kfd_mqd_manager_v10.c drm/amdkfd: Check HIQ's MQD for queue preemption status 2021-03-23 22:59:25 -04:00
kfd_mqd_manager_vi.c drm/amdkfd: Check HIQ's MQD for queue preemption status 2021-03-23 22:59:25 -04:00
kfd_mqd_manager.c drm/amdkfd: Account for SH/SE count when setting up cu masks. 2021-08-26 13:56:01 -04:00
kfd_mqd_manager.h drm/amdkfd: Account for SH/SE count when setting up cu masks. 2021-08-26 13:56:01 -04:00
kfd_packet_manager_v9.c drm/amdkfd: add per-vmid-debug map_process_support 2021-04-23 17:16:05 -04:00
kfd_packet_manager_vi.c drm/amdkfd: dqm fence memory corruption 2021-04-09 16:47:06 -04:00
kfd_packet_manager.c drm/amdkfd: enable cyan_skillfish KFD 2021-07-23 10:08:01 -04:00
kfd_pasid.c
kfd_pm4_headers_ai.h
kfd_pm4_headers_aldebaran.h drm/amdkfd: add per-vmid-debug map_process_support 2021-04-23 17:16:05 -04:00
kfd_pm4_headers_diq.h
kfd_pm4_headers_vi.h
kfd_pm4_headers.h
kfd_pm4_opcodes.h
kfd_priv.h drm/amdkfd: make needs_pcie_atomics FW-version dependent 2021-09-14 15:56:50 -04:00
kfd_process_queue_manager.c drm/amdkfd: fix sysfs kobj leak 2021-06-30 00:18:23 -04:00
kfd_process.c Revert "Revert "drm/amdkfd: Make TLB flush conditional on mapping"" 2021-08-02 17:21:24 -04:00
kfd_queue.c
kfd_smi_events.c drm/amdkfd: Update SMI throttle event bitmask 2021-07-23 10:08:00 -04:00
kfd_smi_events.h drm/amdkfd: Update SMI throttle event bitmask 2021-07-23 10:08:00 -04:00
kfd_svm.c drm/amdkfd: rm BO resv on validation to avoid deadlock 2021-10-08 13:20:31 -04:00
kfd_svm.h drm/amdkfd: check access permisson to restore retry fault 2021-08-24 15:36:50 -04:00
kfd_topology.c drm/amdkfd: Expose GFXIP engine version to sysfs 2021-08-05 21:18:00 -04:00
kfd_topology.h drm/amdkfd: Expose GFXIP engine version to sysfs 2021-08-05 21:18:00 -04:00
Makefile drm/amdkfd: Add CONFIG_HSA_AMD_SVM 2021-04-20 21:50:35 -04:00
soc15_int.h drm/amdkfd: add sdma poison consumption handling 2021-06-07 14:57:24 -04:00