Since amdgpu has always requested PCIE atomics, kfd don't
need duplicated PCIE atomics enablement. Referring to amdgpu
request result is enough.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
GWS bo is shared between all kfd processes. Add function to add gws
to kfd process's bo list so gws can be evicted from and restored
for process.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add amdgpu_amdkfd interface to alloc and free gws
from amdgpu
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add amdgpu_amdkfd interface to get num_gws and add num_gws
to /sys/class/kfd/kfd/topology/nodes/x/properties. Only report
num_gws if MEC FW support GWS barriers. Currently it is
determined by a module parameter which will be replaced
with MEC FW version check when firmware is ready.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use HMM helper function hmm_vma_fault() to get physical pages backing
userptr and start CPU page table update track of those pages. Then use
hmm_vma_range_done() to check if those pages are updated before
amdgpu_cs_submit for gfx or before user queues are resumed for kfd.
If userptr pages are updated, for gfx, amdgpu_cs_ioctl will restart
from scratch, for kfd, restore worker is rescheduled to retry.
HMM simplify the CPU page table concurrent update check, so remove
guptasklock, mmu_invalidations, last_set_pages fields from
amdgpu_ttm_tt struct.
HMM does not pin the page (increase page ref count), so remove related
operations like release_pages(), put_page(), mark_page_dirty().
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
KFD need to provide the info for upper level to determine the data path
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Introduce a new memory type (KFD_IOC_ALLOC_MEM_FLAGS_MMIO_REMAP) and
expose mmio page of HDP registers to user space through this new
memory type.
v2: moved remapped hdp regs to adev struct
v3: rename the new memory type to ALLOC_MEM_FLAGS_MMIO_REMAP
v4: use more generic function name
v5: Fail remapped mmio allocation for asics before gfx9
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Method of getting firmware version is the same across ASICs, so remove
them from ASIC-specific files and create one in amdgpu_amdkfd.c. This new
created get_fw_version simply reads fw_version from adev->gfx than parsing
the ucode header.
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
RAS ECC event will combine with GPU reset event, due to
ECC interrupts are caused by uncorrectable error that triggers
GPU reset.
v2: Fix misleading-indentation warning
v3: fix build with CONFIG_HSA_AMD disabled
Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use HMM helper function hmm_vma_fault() to get physical pages backing
userptr and start CPU page table update track of those pages. Then use
hmm_vma_range_done() to check if those pages are updated before
amdgpu_cs_submit for gfx or before user queues are resumed for kfd.
If userptr pages are updated, for gfx, amdgpu_cs_ioctl will restart
from scratch, for kfd, restore worker is rescheduled to retry.
HMM simplify the CPU page table concurrent update check, so remove
guptasklock, mmu_invalidations, last_set_pages fields from
amdgpu_ttm_tt struct.
HMM does not pin the page (increase page ref count), so remove related
operations like release_pages(), put_page(), mark_page_dirty().
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
kgd2kfd function pointers and global kgd2kfd pointer are no longer in use.
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Since amdkfd is merged into amdgpu module and amdgpu can access amdkfd
directly, move declaration of kgd2kfd functions from kfd_priv.h to
amdgpu_amdkfd.h
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This is used for interoperability between ROCm compute and graphics
APIs. It allows importing graphics driver BOs into the ROCm SVM
address space for zero-copy GPU access.
The API is split into two steps (query and import) to allow user mode
to manage the virtual address space allocation for the imported buffer.
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We don't want KFD processes evicting each other over VRAM usage.
Therefore prevent overcommitting VRAM among KFD applications with
a per-GPU limit. Also leave enough room for page tables on top
of the application memory usage.
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Eric Huang <JinHuiEric.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Avoid including mmu_context.h in amdgpu_amdkfd.h since that may be
included in other header files that define traces. This leads to
conflicts due to traces defined in other headers included via
mmu_context.h.
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add amdgpu_amdkfd_ prefix to amdgpu functions served for amdkfd usage.
v2: fix indentation
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
amdgpu_gpuvm_get_process_page_dir should return the page table address
in the format expected by the pm4_map_process packet for all ASIC
generations.
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
CWSR fails on Raven if the control stack is MTYPE_UC, which is used
for regular GART mappings. As a workaround we map it using MTYPE_NC.
The MEC firmware expects the control stack at one page offset from the
start of the MQD so it is part of the MQD allocation on GFXv9. AMDGPU
added a memory allocation flag just for this purpose.
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Retrieve hive_id from amdgpu device
v2: compile fix
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For compute vm acquired from amdgpu, vm.pasid is managed
by kfd. Decouple pasid from such vm on process destroy
to avoid duplicate pasid release.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
To make a amdgpu vm to a compute vm, the old pasid will be freed and
replaced with a pasid managed by kfd. Kfd can't reuse original pasid
allocated by amdgpu because kfd uses different pasid policy with amdgpu.
For example, all graphic devices share one same pasid in a process.
v2: rebase (Alex)
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This allows automatic switching to the compute power profile depending
on compute activity.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Eric Huang <JinHuiEric.Huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Hook up the gpu_recover callback from KFD to amdgpu to enable
handling of GPU hangs detected by KFD.
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
amdgpu save the vm fault related information for KFD usage and keep the
copy until KFD read it.
Signed-off-by: shaoyun liu <shaoyun.liu@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
In case CONFIG_HSA_AMD is not chosen, there is no need to compile amdkfd
files that reside inside amdgpu dirver. In addition, because amdkfd
depends on x86_64 architecture and amdgpu is not, compiling amdkfd files
under i386 architecture can cause compiler errors and warnings.
This patch modifies amdgpu's makefile to build amdkfd files only if
CONFIG_HSA_AMD is chosen. The only file to be compiled unconditionally
is amdgpu_amdkfd.c
There are stub functions that are compiled only if amdkfd is not
compiled. In that case, calls from amdgpu driver proper will go to those
functions instead of the real functions.
v2: instead of using function pointers, use stub functions
v3: initialize kgd2kfd to NULL in case amdkfd is not compiled
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
This adds support for allocating, mapping, unmapping and freeing
userptr BOs, and for handling MMU notifiers.
v2: updated a comment
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
This commit adds the notion of MMU notifier types GFX and HSA. GFX
continues to work like MMU notifiers did before. HSA adds support for
KFD userptr BOs. The implementation of KFD userptr eviction is a stub
for now.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
This allows acquiring an existing VM from a render node FD to use it
for a compute process.
Such VMs get destroyed when the original file descriptor is released.
Added a callback from amdgpu_vm_fini to handle KFD VM destruction
correctly in this case.
v2:
* Removed vm->vm_context check in amdgpu_amdkfd_gpuvm_destroy_cb,
check vm->process_info earlier instead
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Remove struct amdkfd_vm and move the fields into struct amdgpu_vm.
This will allow turning a VM created by a DRM render node into a
KFD VM.
v2: Removed vm_context field
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
This can be used for flushing caches when not using the HWS.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
v2:
* Removed unused flags from struct kgd_mem
* Updated some comments
* Added a check to unmap_memory_from_gpu whether BO was mapped
v3: add mutex_destroy in relevant places
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Add GPUVM size and DRM render node. Also add function to query the
VMID mask to avoid hard-coding it in multiple places later.
v2: cut off GPUVM size at the VA hole
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
This fence is used by KFD to keep memory resident while user mode
queues are enabled. Trying to evict memory will trigger the
enable_signaling callback, which starts a KFD eviction, which
involves preempting user mode queues before signaling the fence.
There is one such fence per process.
v2:
* Grab a reference to mm_struct
* Dereference fence after NULL check
* Simplify fence release, no need to signal without anyone waiting
* Added signed-off-by Harish, who is the original author of this code
v3:
* update MAINTAINERS file
* change amd_kfd_ prefix to amdkfd_
* remove useless initialization of variable to NULL
v4:
* set amdkfd_fence_ops to be static
* Suggested by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Add functions to report the vram_usage from the amdgpu_device
Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Various bug fixes and improvements that accumulated over the last two
years.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
kfd2kgd is device-specific, so it should not be a global variable.
Merge amdgpu_amdkfd_load_interface and amdgpu_amdkfd_device_probe
so that it's only needed as a local variable in one function.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rename straggler instances of r(adeon)dev to a(mdgpu)dev
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
amdgpu must load only after amdkfd's loading has been completed. If that
is not enforced, then amdgpu's call into amdkfd's functions will cause a
kernel BUG.
When amdgpu and amdkfd are built as kernel modules, that rule is enforced
by the kernel's modules loading mechanism. When amdgpu and amdkfd are
built inside the kernel image, that rule is enforced by ordering in the
drm Makefile (amdkfd before amdgpu).
Instead of using drm Makefile ordering, we can now use deferred loading
as amdkfd now returns -EPROBE_DEFER in kgd2kfd_init() when it is not yet
loaded.
This patch defers amdgpu loading by propagating -EPROBE_DEFER to the
kernel's drivers loading infrastructure. That will put amdgpu into the
pending drivers list (see description in dd.c). Once amdkfd is loaded,
a call to kgd2kfd_init() will return successfully and amdgpu will be able
to load.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
This patch adds the gfx8 interface file between amdgpu and amdkfd. This
interface file is currently in use when running on a Carrizo-based
system.
The interface itself is represented by a pointer to struct
kfd_dev. The pointer is located inside amdgpu_device structure.
All the register accesses that amdkfd need are done using this
interface. This allows us to avoid direct register accesses in
amdkfd proper, while also allows us to avoid locking between
amdkfd and amdgpu.
The single exception is the doorbells that are used in both of
the drivers. However, because they are located in separate pci
bar pages, the danger of sharing registers between the drivers
is minimal.
Having said that, we are planning to move the doorbells as well
to amdgpu.
Signed-off-by: Ben Goz <ben.goz@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
This patch adds an interface file between amdgpu and amdkfd. This
interface file is H/W agnostic, thus containing functions that
operate the same for any AMD APU/GPU H/W generation.
The functions in this interface mirror (some) of the functions in
radeon_kfd.c (the radeon<-->amdkfd interface file). The main functions
are:
- amdgpu_amdkfd_init - initialize the amdkfd module
- amdgpu_amdkfd_load_interface - load the H/W interface according to the
currently probed device
- amdgpu_amdkfd_device_probe - probe the device in amdkfd
- amdgpu_amdkfd_device_init - initialize the device in amdkfd
- amdgpu_amdkfd_interrupt - call the ISR of amdkfd
- amdgpu_amdkfd_suspend - suspend callback from amdgpu
- amdgpu_amdkfd_resume - resume callback from amdgpu
This patch also modifies the relevant amdgpu files, to use this new
interface.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>