linux

Author	SHA1	Message	Date
Victor Zhao	db7f1e0140	drm/amdgpu: fix r initial values Sriov gets suspend of IP block <dce_virtual> failed as return value was not initialized. v2: return 0 directly to align original code semantic before this was broken out into a separate helper function instead of setting initial values Signed-off-by: Victor Zhao <Victor.Zhao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-05-10 18:06:42 -04:00
Dennis Li	0e0036c7d1	drm/amdgpu: fix no full coverage issue for gprs initialization The wave's number per simd in aldebaran is changed to 8, so it is impossible to use old algorithm to initiate all sgprs with one threadgroup. The new algorithm firstly use three threadgroups to initiate most sgprs simultaneously and then use another threadgroup with 4 waves to cover other uninitiated sgprs. v2: Add more description about the new algorithm to clear sgprs and add some comment for shader binaries Signed-off-by: Dennis Li <Dennis.Li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-28 23:36:05 -04:00
Philip Yang	36255b5f61	drm/amdgpu: address remove from fault filter Add interface to remove address from fault filter ring by resetting fault ring entry key, then future vm fault on the address will be processed to recover. Define fault key as atomic64_t type to use atomic read/set/cmpxchg key to protect fault ring access by interrupt handler and interrupt deferred work for vg20. Change fault->timestamp to 48-bit to share same uint64_t with 8-bit fault->next, it is enough for 48bit IH timestamp. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-28 23:36:05 -04:00
Philip Yang	11dd55d174	drm/amdgpu: return IH ring drain finished if ring is empty Sometimes IH do not setup ring wptr overflow flag after wptr exceed rptr. As a workaround, if IH rptr equals to wptr, ring is empty, return true to indicate IH ring checkpoint is processed, IH ring drain is finished. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-28 23:36:05 -04:00
Jack Zhang	95ea3dbc4e	drm/amd/amdgpu/sriov disable all ip hw status by default Disable all ip's hw status to false before any hw_init. Only set it to true until its hw_init is executed. The old 5.9 branch has this change but somehow the 5.11 kernrel does not have this fix. Without this change, sriov tdr have gfx IB test fail. Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com> Review-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-28 23:36:05 -04:00
Christian König	dd03daec0f	drm/amdgpu: restructure amdgpu_vram_mgr_new Merge the two loops, loosen the restriction for big allocations. This reduces the CPU overhead in the good case, but increases it a bit under memory pressure. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-and-Tested-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-28 23:36:05 -04:00
Feifei Xu	5d11699914	drm/amdgpu: Correct and simplify sdma 4.x irq.num_types Correct and init the sdma4.x irq.num_types. v2: squash in fix (Alex) Signed-off-by: Feifei Xu <Feifei.Xu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-28 23:36:04 -04:00
Feifei Xu	3d2bee9188	drm/amdgpu: Change the sdma interrupt print level Change the print level into debug. Signed-off-by: Feifei Xu <Feifei.Xu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-28 23:35:51 -04:00
Victor Zhao	041e69160d	drm/amdgpu/sriov: Remove clear vf fw support PSP clear_vf_fw feature is outdated and has been removed. Remove the related functions. Signed-off-by: Victor Zhao <Victor.Zhao@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-28 23:35:51 -04:00
Hawking Zhang	a30f128602	drm/amdgpu: provide socket/die id info in RAS message Add socket/die information in RAS messages for platforms that support query those information Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: John Clements <John.Clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-28 23:35:50 -04:00
Hawking Zhang	dfdd4b8a95	drm/amdgpu: implement smuio callback to query socket id get_socket_id is used to query socket id Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: John Clements <John.Clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-28 23:35:49 -04:00
Jinzhou Su	ec0f72cb95	drm/amdgpu: Enable SDMA LS for Vangogh Add flags AMD_CG_SUPPORT_SDMA_LS for Vangogh. Start to open sdma ls from firmware version 70. Signed-off-by: Jinzhou Su <Jinzhou.Su@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-28 23:35:49 -04:00
Souptick Joarder	5f5cb2afd6	drm/amdgpu: Added missing prototype Kernel test robot throws below warning -> >> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c:125:5: warning: >> no previous prototype for 'kgd_arcturus_hqd_sdma_load' >> [-Wmissing-prototypes] 125 \| int kgd_arcturus_hqd_sdma_load(struct kgd_dev kgd, void mqd, >> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c:195:5: warning: >> no previous prototype for 'kgd_arcturus_hqd_sdma_dump' >> [-Wmissing-prototypes] 195 \| int kgd_arcturus_hqd_sdma_dump(struct kgd_dev kgd, >> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c:227:6: warning: >> no previous prototype for 'kgd_arcturus_hqd_sdma_is_occupied' >> [-Wmissing-prototypes] 227 \| bool kgd_arcturus_hqd_sdma_is_occupied(struct kgd_dev kgd, void mqd) \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c:246:5: warning: >> no previous prototype for 'kgd_arcturus_hqd_sdma_destroy' >> [-Wmissing-prototypes] 246 \| int kgd_arcturus_hqd_sdma_destroy(struct kgd_dev kgd, void *mqd, \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Added prototype for these functions. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-28 23:35:49 -04:00
Christian König	3dc7216c1d	drm/amdgpu: fix concurrent VM flushes on Vega/Navi v2 Starting with Vega the hardware supports concurrent flushes of VMID which can be used to implement per process VMID allocation. But concurrent flushes are mutual exclusive with back to back VMID allocations, fix this to avoid a VMID used in two ways at the same time. v2: don't set ring to NULL Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Tested-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-23 17:19:05 -04:00
Nirmoy Das	42daecfc20	drm/amdgpu: remove AMDGPU_GEM_CREATE_SHADOW flag Remove unused AMDGPU_GEM_CREATE_SHADOW flag. Userspace is never allowed to use this. Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-23 17:18:56 -04:00
Nirmoy Das	cd2454d6cd	drm/amdgpu: cleanup amdgpu_bo_create() Remove shadow bo related code as vm code is creating shadow bo using proper API. Without shadow bo code, amdgpu_bo_create() is basically a wrapper around amdgpu_bo_do_create(). So rename amdgpu_bo_do_create() to amdgpu_bo_create(). Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-23 17:18:47 -04:00
Nirmoy Das	adf6f5c51e	drm/amdgpu: create shadow bo using amdgpu_bo_create_shadow() Shadow BOs are only needed for vm code so call amdgpu_bo_create_shadow() directly instead of depending on amdgpu_bo_create(). Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-23 17:18:32 -04:00
Nirmoy Das	77df5c131d	drm/amdgpu: remove unused vm context flags Remove unused AMDGPU_VM_CONTEXT_GFX and AMDGPU_VM_CONTEXT_COMPUTE flags. Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-23 17:18:26 -04:00
Nirmoy Das	a35455d065	drm/amdgpu: cleanup amdgpu_vm_init() Currently only way to create compute vm is through amdgpu_vm_make_compute(). So vm_context isn't required anymore for amdgpu_vm_init(). Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-23 17:18:19 -04:00
Nirmoy Das	25e9146ae6	drm/amdgpu: expose amdgpu_bo_create_shadow() Exposed amdgpu_bo_create_shadow() will be needed for amdgpu_vm handling. Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-23 17:18:10 -04:00
Christian König	589939d401	drm/amdgpu: fix coding style and documentation in amdgpu_vram_mgr.c No functional changes, just cleaning up some leftovers and improve documentation. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Nirmoy Das <nirmoy.aiemd@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-23 17:18:02 -04:00
Christian König	a614b336f1	drm/amdgpu: fix coding style and documentation in amdgpu_gtt_mgr.c Avoid the forward define, fix coding style, add documentation. No functional change. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Nirmoy Das <nirmoy.aiemd@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-23 17:17:56 -04:00
Alex Sierra	126bbd4ab5	drm/amdgpu: extend xnack limit page fault timeout Extending this timeout will prevent IH from storm interrupts coming from SDMA while a page fault is active. Currently, on Aldebaran, handling that many interrupts can take a lot of CPU time (up to 4 seconds). This eventually causes timeouts in other tasks. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-23 17:16:20 -04:00
Hawking Zhang	502f0e2804	drm/amdgpu: disable gfx ras by default in aldebaran aldebaran gfx ras is still under development Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-23 17:16:14 -04:00
Kenneth Feng	1d712be90a	drm/amd/amdgpu: add cgls enable cgls to improve the runtime power efficiency. Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-23 17:15:59 -04:00
Stanley.Yang	19d0dfda4c	drm/amdgpu: optimize gfx ras features flag clean Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-23 17:15:52 -04:00
Jinzhou Su	ef9bcfde9e	drm/amdgpu: Enable SDMA MGCG for Vangogh Add flags AMD_CG_SUPPORT_SDMA_MGCG for Vangogh. Start to open sdma mgcg from firmware version 70. Signed-off-by: Jinzhou Su <Jinzhou.Su@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-23 17:15:39 -04:00
John Clements	7e882aee84	drm/amdgpu: add support for ras init flags conditionally configure ras for dgpu mode or poison propogation mode Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-23 17:15:33 -04:00
Dennis Li	6effe77972	drm/amdgpu: refine gprs init shaders to check coverage Add codes to check whether all SIMDs are covered, make sure that all GPRs are initialized. Signed-off-by: Dennis Li <Dennis.Li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-23 17:15:21 -04:00
Lee Jones	3bffd71deb	drm/amd/amdgpu/amdgpu_cs: Repair some function naming disparity Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:685: warning: expecting prototype for cs_parser_fini(). Prototype was for amdgpu_cs_parser_fini() instead drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:1502: warning: expecting prototype for amdgpu_cs_wait_all_fence(). Prototype was for amdgpu_cs_wait_all_fences() instead drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c:1656: warning: expecting prototype for amdgpu_cs_find_bo_va(). Prototype was for amdgpu_cs_find_mapping() instead Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Jerome Glisse <glisse@freedesktop.org> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-20 21:51:13 -04:00
Lee Jones	03691f5502	drm/amd/amdgpu/amdgpu_ring: Provide description for 'sched_score' Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c:169: warning: Function parameter or member 'sched_score' not described in 'amdgpu_ring_init' Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-20 21:51:09 -04:00
Lee Jones	27aa4a69b4	drm/amd/amdgpu/amdgpu_ttm: Fix incorrectly documented function 'amdgpu_ttm_copy_mem_to_mem()' Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c:311: warning: expecting prototype for amdgpu_copy_ttm_mem_to_mem(). Prototype was for amdgpu_ttm_copy_mem_to_mem() instead Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Jerome Glisse <glisse@freedesktop.org> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-20 21:51:06 -04:00
Lee Jones	777d9000d9	drm/amd/amdgpu/amdgpu_gart: Correct a couple of function names in the docs Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c:73: warning: expecting prototype for amdgpu_dummy_page_init(). Prototype was for amdgpu_gart_dummy_page_init() instead drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c:96: warning: expecting prototype for amdgpu_dummy_page_fini(). Prototype was for amdgpu_gart_dummy_page_fini() instead Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Nirmoy Das <nirmoy.das@amd.com> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Reviewed-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-20 21:51:02 -04:00
Lee Jones	b16cc4bb1a	drm/amd/amdgpu/amdgpu_fence: Provide description for 'sched_score' Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c:444: warning: Function parameter or member 'sched_score' not described in 'amdgpu_fence_driver_init_ring' Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Jerome Glisse <glisse@freedesktop.org> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-20 21:50:59 -04:00
Lee Jones	2196927bcb	drm/amd/amdgpu/amdgpu_device: Remove unused variable 'r' Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/amdgpu_device.c: In function ‘amdgpu_device_suspend’: drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:3733:6: warning: variable ‘r’ set but not used [-Wunused-but-set-variable] Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-20 21:50:48 -04:00
Alex Sierra	485bea1f90	drm/amdgpu: add svm_bo eviction to enable_signal cb Add to amdgpu_amdkfd_fence.enable_signal callback, support for svm_bo fence eviction. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-20 21:49:56 -04:00
Alex Sierra	5f319c5c21	drm/amdgpu: svm bo enable_signal call condition [why] To support svm bo eviction mechanism. [how] If the BO crated has AMDGPU_AMDKFD_CREATE_SVM_BO flag set, enable_signal callback will be called inside amdgpu_evict_flags. This also causes gutting of the BO by removing all placements, so that TTM won't actually do an eviction. Instead it will discard the memory held by the BO. This is needed for HMM migration to user mode system memory pages. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-20 21:49:49 -04:00
Alex Sierra	f04c79cfba	drm/amdgpu: add param bit flag to create SVM BOs Add CREATE_SVM_BO define bit for SVM BOs. Another define flag was moved to concentrate these KFD type flags in one include file. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-20 21:49:31 -04:00
Alex Sierra	eb2cec5537	drm/amdkfd: add svm_bo reference for eviction fence [why] As part of the SVM functionality, the eviction mechanism used for SVM_BOs is different. This mechanism uses one eviction fence per prange, instead of one fence per kfd_process. [how] A svm_bo reference to amdgpu_amdkfd_fence to allow differentiate between SVM_BO or regular BO evictions. This also include modifications to set the reference at the fence creation call. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-20 21:49:22 -04:00
Alex Sierra	ea53af8a59	drm/amdkfd: SVM API call to restore page tables Use SVM API to restore page tables when retry fault and compute context are enabled. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-20 21:49:15 -04:00
Alex Sierra	9dd9cc2f74	drm/amdgpu: enable 48-bit IH timestamp counter By default this timestamp is 32 bit counter. It gets overflowed in around 10 minutes. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-20 21:48:58 -04:00
Philip Yang	c46ebb6a6d	drm/amdkfd: set memory limit to avoid OOM with HMM enabled HMM migration alloc sizeof(struct page) on system memory for each VRAM page, it is 1GB system memory reserved for 64GB VRAM. To avoid application OOM, increase system memory used size based on VRAM size of all GPUs, then application alloc memory will fail if system memory usage reach the limit. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-20 21:48:00 -04:00
Felix Kuehling	9705c85ff2	drm/amdgpu: Enable retry faults unconditionally on Aldebaran This is needed to allow per-process XNACK mode selection in the SQ when booting with XNACK off by default. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Tested-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-20 21:47:34 -04:00
Felix Kuehling	f80fe9d3c1	drm/amdkfd: map svm range to GPUs Use amdgpu_vm_bo_update_mapping to update GPU page table to map or unmap svm range system memory pages address to GPUs. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-20 21:47:21 -04:00
Philip Yang	d27afacfea	drm/amdgpu: export vm update mapping interface It will be used by kfd to map svm range to GPU, because svm range does not have amdgpu_bo and bo_va, cannot use amdgpu_bo_update interface, use amdgpu vm update interface directly. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-20 21:47:13 -04:00
Philip Yang	d8a3c1c80c	drm/amdkfd: support larger svm range allocation For larger range allocation, if hmm_range_fault return -EBUSY, set retry timeout based on 1 second for every 512MB, this is safe timeout value. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-20 21:46:50 -04:00
Philip Yang	04d8d73dbc	drm/amdgpu: add common HMM get pages function Move the HMM get pages function from amdgpu_ttm and to amdgpu_mn. This common function will be used by new svm APIs. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-20 21:46:43 -04:00
Felix Kuehling	cccbeb6209	drm/amdgpu: Remove verify_access shortcut for KFD BOs This shortcut is no longer needed with access managed properly by KFD. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-20 21:46:01 -04:00
Felix Kuehling	d4ec4bdc0b	drm/amdkfd: Allow access for mmapping KFD BOs DRM render node file handles are used for CPU mapping of BOs using mmap by the Thunk. It uses the DRM render node of the GPU where the BO was allocated. DRM allows mmap access automatically when it creates a GEM handle for a BO. KFD BOs don't have GEM handles, so KFD needs to manage access manually. Use drm_vma_node_allow to allow user mode to mmap BOs allocated with kfd_ioctl_alloc_memory_of_gpu through the DRM render node that was used in the kfd_ioctl_acquire_vm call for the same GPU. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Philip Yang <philip.yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-20 21:45:54 -04:00
Felix Kuehling	b40a6ab2cf	drm/amdkfd: Use drm_priv to pass VM from KFD to amdgpu amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu needs the drm_priv to allow mmap to access the BO through the corresponding file descriptor. The VM can also be extracted from drm_priv, so drm_priv can replace the vm parameter in the kfd2kgd interface. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Philip Yang <philip.yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-20 21:45:45 -04:00

1 2 3 4 5 ...

9078 Commits