Commit Graph

5889 Commits

Author SHA1 Message Date
Hawking Zhang
861324983d drm/amdgpu: correct irq type used for sdma ecc
we should pass irq type, instead of irq client id,
to irq_get/put interface

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:35 -05:00
Evan Quan
1f96ecef6f drm/amd/powerplay: correct UVD/VCE/VCN power status retrieval
VCN should be used for Vega20 later ASICs while UVD and VCE
are for previous ASICs.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:34 -05:00
Le Ma
7d0e6329df drm/amdgpu: update more sdma instances irq support
Update for sdma ras ecc_irq and other minors.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:34 -05:00
Evan Quan
7c16d24abe drm/amdgpu: correct VCN powergate routine for acturus
Arcturus VCN should powergate in the way as Navi.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:34 -05:00
Evan Quan
fe089e1dd7 drm/amd/powerplay: enable arcturus powerplay
Arcturus powerplay is ready to use.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:33 -05:00
Evan Quan
cca4fafc09 drm/amd/powerplay: initialize arcturus MP1 and THM base address
Initialize base address for those IPs which are used in powerplay.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:33 -05:00
Wang Xiayang
1a2c29bce0 drm/amdgpu: fix a potential information leaking bug
Coccinelle reports a path that the array "data" is never initialized.
The path skips the checks in the conditional branches when either
of callback functions, read_wave_vgprs and read_wave_sgprs, is not
registered. Later, the uninitialized "data" array is read
in the while-loop below and passed to put_user().

Fix the path by allocating the array with kcalloc().

The patch is simplier than adding a fall-back branch that explicitly
calls memset(data, 0, ...). Also it does not need the multiplication
1024*sizeof(*data) as the size parameter for memset() though there is
no risk of integer overflow.

Signed-off-by: Wang Xiayang <xywang.sjtu@sjtu.edu.cn>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:33 -05:00
Christian König
6494120695 drm/amdgpu: fix error handling in amdgpu_cs_process_fence_dep
We always need to drop the ctx reference and should check
for errors first and then dereference the fence pointer.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:33 -05:00
Thong Thai
c74dbe44ea drm/amd/amdgpu/vcn_v2_0: Move VCN 2.0 specific dec ring test to vcn_v2_0
VCN 2.0 firmware now requires a packet start command to be sent before
any other decode ring buffer command.

Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:33 -05:00
Alex Deucher
3207dcf3af drm/amdgpu/gfx10: update golden settings for navi14
Updated settings for hw team.

Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:32 -05:00
Kevin Wang
98eb03bbf0 drm/amd/powerplay: implment sysfs feature status function in smu
1. Unified feature enable status format in sysfs
2. Rename ppfeature to pp_features to adapt other pp sysfs node name
3. this function support all asic, not asic related function.

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Acked-by: Rui Huang <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:32 -05:00
Joseph Greathouse
2c89731803 drm/amdgpu: Default disable GDS for compute+gfx
Units in the GDS block default to allowing all VMIDs access to all
entries. Disable shader access to the GDS, GWS, and OA blocks from all
compute and gfx VMIDs by default. For compute, HWS firmware will set
up the access bits for the appropriate VMID when a compute queue
requires access to these blocks.
The driver will handle enabling access on-demand for graphics VMIDs.

Leaving VMID0 with full access because otherwise HWS cannot save or
restore values during task switch.

v2: Fixed code and comment styling.

Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:32 -05:00
Thong Thai
333fe325fe drm/amd/amdgpu/vcn_v2_0: Mark RB commands as KMD commands
Sets the CMD_SOURCE bit for VCN 2.0 Decoder Ring Buffer commands. This
bit was previously set by the RBC HW on older firmware. Newer firmware
uses a SW RBC and this bit has to be set by the driver.

Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:32 -05:00
Andrey Grodzovsky
f2bd8a0ed7 drm/amdgpu: Fix amdgpu_display_supported_domains logic.
Add restriction to dissallow GTT domain if the relevant BO
doesn't have USWC flag set to avoid the APU hang scenario.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:48:32 -05:00
Alex Deucher
a3a09142f4 drm/amdgpu: put the SMC into the proper state on reset/unload
When doing a GPU reset or unloading the driver, we need to
put the SMU into the apprpriate state for the re-init after
the reset or unload to reliably work.

I don't think this is necessary for BACO because the SMU actually
controls the BACO state to it needs to be active.

For suspend (S3), the asic is put into D3 so the SMU would be
powered down so I don't think we need to put the SMU into
any special state.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:24:44 -05:00
Alex Deucher
2ddc6c3ef9 drm/amdgpu: add reset_method asic callback for navi
Navi uses either mode1 or baco depending on various
conditions.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:24:17 -05:00
Alex Deucher
ee360c0b7c drm/amdgpu: add reset_method asic callback for soc15
APUs only support mode2 reset.  dGPUs use either mode1 or
baco depending on various conditions.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:24:13 -05:00
Alex Deucher
9bc1932f5c drm/amdgpu: add reset_method asic callback for vi
VI always uses the legacy pci based reset.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:24:10 -05:00
Alex Deucher
6d0f50dafe drm/amdgpu: add reset_method asic callback for cik
CIK always uses the legacy pci based reset.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:24:06 -05:00
Alex Deucher
dd81eede77 drm/amdgpu: add reset_method asic callback for si
SI always uses the legacy pci based reset.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:23:56 -05:00
Alex Deucher
0cf3c64f29 drm/amdgpu: add an asic callback to determine the reset method
Sometimes the driver may have to behave differently depending
on the method we are using to reset the GPU.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:23:44 -05:00
Evan Quan
f0d2a7dc11 drm/amd/powerplay: fix null pointer dereference around dpm state relates
DPM state relates are not supported on the new SW SMU ASICs. But still
it's not OK to trigger null pointer dereference on accessing them.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:23:29 -05:00
Evan Quan
fcd90fee8a drm/amd/powerplay: minor fixes around SW SMU power and fan setting
Add checking for possible invalid input and null pointer. And
drop redundant code.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:23:21 -05:00
Shirish S
1c42591591 drm/amd/display: enable S/G for RAVEN chip
enables gpu_vm_support in dm and adds
AMDGPU_GEM_DOMAIN_GTT as supported domain

v2:
Move BO placement logic into amdgpu_display_supported_domains

v3:
Use amdgpu_bo_validate_uswc in amdgpu_display_supported_domains.

v4:
amdgpu_bo_validate_uswc moved to sepperate patch.

Signed-off-by: Shirish S <shirish.s@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:23:12 -05:00
Andrey Grodzovsky
ddcb7fc62f drm/amdgpu: Add check for USWC support for amdgpu_display_supported_domains
This verifies we don't add GTT as allowed domain for APUs when USWC
is disabled.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:22:55 -05:00
Andrey Grodzovsky
3d1b8ec76b drm/amdgpu: Create helper to clear AMDGPU_GEM_CREATE_CPU_GTT_USWC
Move the logic to clear AMDGPU_GEM_CREATE_CPU_GTT_USWC in
amdgpu_bo_do_create into standalone helper so it can be reused
in other functions.

v4:
Switch to return bool.

v5: Fix typos.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:22:48 -05:00
Andrey Grodzovsky
e4c4073b01 drm/amdgpu: Fix hard hang for S/G display BOs.
HW requires for caching to be unset for scanout BO
mappings when the BO placement is in GTT memory.
Usually the flag to unset is passed from user mode
but for FB mode this was missing.

v2:
Keep all BO placement logic in amdgpu_display_supported_domains

Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Shirish S <shirish.s@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:22:41 -05:00
Jonathan Kim
24f9aacfb0 drm/amdgpu: adding xgmi error monitoring
monitor xgmi errors via mc pie status through fica registers.

Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
Reviewed-by: Kent Russell <Kent.Russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:22:34 -05:00
Jonathan Kim
64671c0fdc drm/amdgpu: add perfmon and fica atomics for df
adding perfmon and fica atomic operations to adhere to data fabrics finite
state machine requirements for indirect register access.

Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
Reviewed-by: Kent Russell <Kent.Russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:22:26 -05:00
tiancyin
5f4814deab drm/amdgpu/gmc10: fix pte mytpe field error for navi14
navi14 share same PTE format with navi10.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: tiancyin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:19:35 -05:00
James Zhu
6913848087 drm/amdgpu: use VCN firmware offset for cache window
Since we are using the signed FW now, and also using PSP firmware loading,
but it's still potential to break driver when loading FW directly
instead of PSP, so we should add offset.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:19:28 -05:00
Evan Quan
780f3a9c5b drm/amd/powerplay: some cosmetic fixes
Drop redundant check, duplicate check, duplicate setting
and fix the return value.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:19:13 -05:00
Chuhong Yuan
911d8b3069 drm/amdgpu: Use dev_get_drvdata where possible
Instead of using to_pci_dev + pci_get_drvdata,
use dev_get_drvdata to make code simpler.

Signed-off-by: Chuhong Yuan <hslester96@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:18:41 -05:00
Kent Russell
4cab85afe9 drm/amdkfd: Fix byte align on VegaM
This was missed during the addition of VegaM support

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-30 23:17:35 -05:00
Alex Deucher
95ccc15508 drm/amdgpu/smu: move fan rpm query into the asic specific code
On vega20, there is an SMU message to query it.  On navi, it's fetched
from the metrics table.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-22 14:57:39 -05:00
Hawking Zhang
d52d6de280 drm/amdgpu: set sdma irq src num according to sdma instances
Otherwise, it will cause driver access non-existing sdma registers
in gpu reset code path

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-22 14:57:31 -05:00
Leo Liu
53ef3969dd drm/amdgpu: use VCN firmware offset for cache window
Since we are using the signed FW now, and also using PSP firmware loading,
but it's still potential to break driver when loading FW directly
instead of PSP, so we should add offset.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:07 -05:00
Hawking Zhang
33c976c961 drm/amdgpu: drop ras self test
this function is not needed any more. error injection is
the only way to validate ras but it can't be executed in
amdgpu_ras_init, where gpu is even not initialized

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:07 -05:00
Hawking Zhang
a5dd40ca81 drm/amdgpu: only allow error injection to UMC IP block
error injection to other IP blocks (except UMC) will be enabled
until RAS feature stablize on those IP blocks

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:07 -05:00
Hawking Zhang
4d249d3abd drm/amdgpu: disable GFX RAS by default
GFX RAS has not been stablized yet. disable GFX ras until
it is fully funcitonal.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:07 -05:00
Hawking Zhang
fb2a36075a drm/amdgpu: do not create ras debugfs/sysfs node for ASICs that don't have ras ability
driver shouldn't init any ras debugfs/sysfs node for ASICs that don't have ras
hardware ability

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:07 -05:00
Joseph Greathouse
fbdc5d8d84 drm/amdgpu: Default disable GDS for compute VMIDs
The GDS and GWS blocks default to allowing all VMIDs to
access all entries. Graphics VMIDs can handle setting
these limits when the driver launches work. However,
compute workloads under HWS control don't go through the
kernel driver. Instead, HWS firmware should set these
limits when a process is put into a VMID slot.

Disable access to these devices by default by turning off
all mask bits (for OA) and setting BASE=SIZE=0 (for GDS
and GWS) for all compute VMIDs. If a process wants to use
these resources, they can request this from the HWS
firmware (when such capabilities are enabled). HWS will
then handle setting the base and limit for the process when
it is assigned to a VMID.

This will also prevent user kernels from getting 'stuck' in
GWS by accident if they write GWS-using code but HWS
firmware is not set up to handle GWS reset. Until HWS is
enabled to handle GWS properly, all GWS accesses will
MEM_VIOL fault the kernel.

v2: Move initialization outside of SRBM mutex

Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:07 -05:00
Alex Deucher
a08a4dae7a drm/amdgpu: flag arcturus as experimental for now
Current support will only work in internal engineering
boards.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:07 -05:00
Alex Deucher
ad91b134a2 drm/amdgpu: drop unused function definitions
These were dropped and the headers never got cleaned up.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:07 -05:00
James Zhu
1da418ba65 drm/amdgpu:add all VCN rings into schedule request queue
Add all VCN instances' decode/encode/jpeg decode rings into
drm_sched_rq list.

Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Boyuan Zhang <boyuan.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:07 -05:00
Le Ma
69d4de94f8 drm/amdgpu: enable all 8 sdma instances for Arcturus silicon
The more 6 sdma instances work fine now with DF fix in vbios:
  * mmDF_PIE_AON_MiscClientsEnable(0x1c728)=0x3fe(DF_ALL_INSTANCE)
       [9:4]MmhubsEnable=3f (change from 0)

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:07 -05:00
Yong Zhao
5ddd4a9a7c drm/amdgpu: Add more detail to the VM fault printing
With the printing, we don't need to parse the value on our own any more.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:06 -05:00
Le Ma
fc1e272e8d drm/amdgpu: limit sdma instances to 2 for Arcturus in BU phase
Another 6 sdma instances do not work at present. Disable them to unblock KFD
for silicon bringup as a workaround

Signed-off-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:06 -05:00
Hawking Zhang
f9cf36fcaf drm/amdgpu: skip gfx 9 common golden settings for arct
They are not needed by arct

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:06 -05:00
Yong Zhao
54bd77f3d0 amd/powerplay: No SW XGMI dpm for Arcturus rev 2
xgmi dpm is handled by the SMU.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18 14:18:06 -05:00