If xnack is on, VM retry fault interrupt send to IH ring1, and ring1
will be full quickly. IH cannot receive other interrupts, this causes
deadlock if migrating buffer using sdma and waiting for sdma done while
handling retry fault.
Remove VMC from IH storm client, enable ring1 write pointer overflow,
then IH will drop retry fault interrupts and be able to receive other
interrupts while driver is handling retry fault.
IH ring1 write pointer doesn't writeback to memory by IH, and ring1
write pointer recorded by self-irq is not updated, so always read
the latest ring1 write pointer from register.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
With the current kfd memory accounting scheme, kfd applications
can use up to 15/16 of total system memory. For system which
has small total system memory size it leaves small system memory
for OS. For example, if the system has totally 16GB of system
memory, this scheme leave OS and non-kfd applications only 1GB
of system memory. In many cases, this leads to OOM killer.
This patch changed the KFD system memory accounting scheme.
15/16 of free system memory when kfd driver load. This deduct
the system memory that OS already use.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Suggested-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
v1: The interrupts need to be enabled to move to DS clocks.
v2: Don't enable GFX IDLE interrupts if there are no GFX rings.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Aldebaran clock gating support for GFX,SDMA,IH blocks
VCN/JPEG blocks are excluded in this patch, to be enabled later
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Acked-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
hdp read cache is removed in aldebaran. don't issue
an mmio write or write data packet to hardware.
v2: rebase
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
This causes infinite retries on the UTCL1 RB, preventing
higher priority RB such as paging RB.
[How]
Set to one the SDMAx_UTLC1_TIMEOUT registers for all SDMAs.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For ASICs that don't support ip discovery feature, query
gfx configuration through atomfirmware interface, rather
than gpu_info firmware.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Disable PCIe BAR resizing on A+A config. It's not needed because we won't use the
PCIe BAR, but it breaks the PCI BAR configuration with the current SBIOS.
Error message of FB BAR resize failure under A+A:
[ 154.913731] [drm:amdgpu_device_resize_fb_bar [amdgpu]] *ERROR* Problem resizing BAR0 (-22).
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Felix Kuehling <Felix.kuehling@amd.com>
Reviewed-by: Christian Koenig <Christian.Koenig@amd.com>
Tested-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For A+A configuration, device memory is supposed to be mapped as
cachable from CPU side. For kernel pre-map gpu device memory using
ioremap_cache
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Christian Koenig <Christian.Koenig@amd.com>
Tested-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
sdma ras function is the main structure to support
sdma ras on aldebaran. the patch initializes late_init
late_fini callbacks.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li<Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Similar as xgmi connected gpu nodes, physical_node_id
* segment_size should be used to calculate the offset
of aper_base.
The asic type check is redundant. once physical_node_id
and segment_size are initialized, it should be count
on.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
aldebaran removed gds internal memory for atomic usage.
it only supports gws opcode in kernel like barrier,
semaphore.etc. there won't be usage of gds in either
kernel or pm4 packet. max_wave_id should also be marked
as deprecated for aldebaran.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use MSG_GfxDriverReset for mode reset and retire MSG_Mode1Reset.
Centralize soc15_asic_mode1_reset() and nv_asic_mode1_reset()functions.
Add mode2_reset_is_support() for smu->ppt_funcs.
Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
vcn fw front door loading is not functional. comments
out vcn/jpeg ip blocks so people can load amdgpu driver
without specify ip_mask module parameter.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add more function pointers to amdgpu_mmhub_funcs. ASIC specific
implementation of most mmhub functions are called from a general
function pointer, instead of calling different function for
different ASIC.
V2: Split patch into upstreamable and aldebaran
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>