Dennis Li
2c960ea02f
drm/amdgpu: add RAS callback for gfx
...
Add functions for RAS error inject and query error counter
Signed-off-by: Dennis Li <Dennis.Li@amd.com >
Reviewed-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:51:08 -05:00
Dennis Li
dc23a08f03
drm/amdgpu: add define for gfx ras subblock
...
Signed-off-by: Dennis Li <Dennis.Li@amd.com >
Reviewed-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:51:01 -05:00
Dennis Li
4bb6b8c758
drm/amd/include: add define of TCP_EDC_CNT_NEW
...
Signed-off-by: Dennis Li <Dennis.Li@amd.com >
Reviewed-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:50:54 -05:00
Dennis Li
ca3f422f53
drm/amd/include: add bitfield define for EDC registers
...
Add EDC registers to support VEGA20 RAS
Signed-off-by: Dennis Li <Dennis.Li@amd.com >
Reviewed-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:50:47 -05:00
Tao Zhou
7cdc2ee300
drm/amdgpu: remove ras_reserve_vram in ras injection
...
error injection address is not in gpu address space
Signed-off-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:50:41 -05:00
Tao Zhou
e10634938b
drm/amdgpu: add check for ras error type
...
only ue and ce errors are supported
Signed-off-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:50:35 -05:00
Tao Zhou
81e02619e9
drm/amdgpu: update interrupt callback for all ras clients
...
add err_data parameter in interrupt cb for ras clients
Signed-off-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:50:29 -05:00
Tao Zhou
cf04dfd0e9
drm/amdgpu: allow ras interrupt callback to return error data
...
add error data as parameter for ras interrupt cb and process it
Signed-off-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:50:23 -05:00
Tao Zhou
8c94810357
drm/amdgpu: query umc ras error address
...
query umc ras error address, translate it to gpu 4k page view
and save it.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:50:17 -05:00
Tao Zhou
c2742aef4d
drm/amdgpu: add structures for umc error address translation
...
add related registers, callback function and channel index table
Signed-off-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:50:11 -05:00
Tao Zhou
6f102dba80
drm/amdgpu: add support for recording ras error address
...
more than one error address may be recorded in one query
Signed-off-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:50:05 -05:00
Tao Zhou
f1ed4afa13
drm/amdgpu: update algorithm of umc uncorrectable error counting
...
remove the check of ErrorCodeExt
v2: refine the if condition for ue counting
Signed-off-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:49:58 -05:00
Tao Zhou
045c021653
drm/amdgpu: switch to amdgpu_umc structure
...
create new amdgpu_umc structure to for more umc
settings in future and switch to the new structure
Signed-off-by: Tao Zhou <tao.zhou1@amd.com >
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:49:52 -05:00
Tao Zhou
5bbfb64a17
drm/amdgpu: use 64bit operation macros for umc
...
replace some 32bit macros with 64bit operations to simplify code
Signed-off-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:49:46 -05:00
Tao Zhou
4fa1c6a679
drm/amdgpu: add RREG64/WREG64(_PCIE) operations
...
add 64 bits register access functions
v2: implement 64 bit functions in low level
Signed-off-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:49:40 -05:00
Tao Zhou
05a58345db
drm/amdgpu: add ras error count after each query (v2)
...
v1: increase ras ce/ue error count
v2: log the number of correctable and uncorrectable errors
Signed-off-by: Tao Zhou <tao.zhou1@amd.com >
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:49:33 -05:00
Hawking Zhang
939e2258ce
drm/amdgpu: querry umc error count
...
check umc error count in both ras querry function and
ras interrupt handler
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:49:28 -05:00
Hawking Zhang
5b6b35aaac
drm/amdgpu: init umc v6_1 functions for vega20
...
init umc callback function for vega20 in sw early init phase
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:49:22 -05:00
Hawking Zhang
9884c2b1c3
drm/amdgpu: add umc v6_1 query error count support
...
Implement umc query_ras_error_count function to support querry
both correctable and uncorrectable error
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Signed-off-by: Tao Zhou <tao.zhou1@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:49:16 -05:00
Hawking Zhang
03c9963f47
drm/amdgpu: add umc v6_1_1 IP headers
...
the change introduces IP headers for unified memory controller (umc)
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:49:10 -05:00
Hawking Zhang
245219a660
drm/amdgpu: add rsmu v_0_0_2 ip headers
...
remote smu (rsmu) is a sub-block used as ip register interface,
error handling, reset generation.etc
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:49:03 -05:00
Hawking Zhang
9e585a523b
drm/amdgpu: add amdgpu_umc_functions structure
...
This is common structure as UMC callback function
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:48:57 -05:00
Hawking Zhang
6501a77170
drm/amdgpu: init RSMU and UMC ip base address for vega20
...
the driver needs to program RSMU and UMC registers to
support vega20 RAS feature
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:48:51 -05:00
Hawking Zhang
7af25d5b7e
drm/amdgpu: move some ras data structure to amdgpu_ras.h
...
These are common structures that can be included by IP specific
source files
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Reviewed-by: Dennis Li <dennis.li@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:48:32 -05:00
Alex Deucher
fa1884f9d8
drm/amdgpu: drop drmP.h from vcn_v2_5.c
...
Unused.
Acked-by: Sam Ravnborg <sam@ravnborg.org >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:33:41 -05:00
Alex Deucher
9a2ffeb525
drm/amdgpu: drop drmP.h from vcn_v2_0.c
...
And fix the fallout.
Acked-by: Sam Ravnborg <sam@ravnborg.org >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:33:36 -05:00
Alex Deucher
75589f496d
drm/amdgpu: drop drmP.h from sdma_v5_0.c
...
And fix the fallout.
Acked-by: Sam Ravnborg <sam@ravnborg.org >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:33:31 -05:00
Alex Deucher
e9eea90247
drm/amdgpu: drop drmP.h from nv.c
...
And fix up the fallout.
Acked-by: Sam Ravnborg <sam@ravnborg.org >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:33:26 -05:00
Alex Deucher
b23b2e9e49
drm/amdgpu: drop drmP.h from navi10_ih.c
...
And fix the fallout.
Acked-by: Sam Ravnborg <sam@ravnborg.org >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:33:21 -05:00
Alex Deucher
0a069bbe13
drm/amdgpu: drop drmP.h in gfx_v10_0.c
...
And fix the fallout.
Acked-by: Sam Ravnborg <sam@ravnborg.org >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:33:16 -05:00
Alex Deucher
3b90f6ecdf
drm/amdgpu: drop drmP.h from amdgpu_amdkfd_gfx_v10.c
...
Unused.
Acked-by: Sam Ravnborg <sam@ravnborg.org >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:33:10 -05:00
Alex Deucher
32978d8cfd
drm/amdgpu: drop drmP.h in amdgpu_amdkfd_arcturus.c
...
Unused.
Acked-by: Sam Ravnborg <sam@ravnborg.org >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-31 14:32:56 -05:00
Evan Quan
59de58f84f
drm/amd/powerplay: determine the features to enable by pptable only
...
Per current logics, the features to enable are determined together
by driver and pptable. This is not efficient in co-debug with
firmware team.
Signed-off-by: Evan Quan <evan.quan@amd.com >
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-30 23:48:35 -05:00
Hawking Zhang
861324983d
drm/amdgpu: correct irq type used for sdma ecc
...
we should pass irq type, instead of irq client id,
to irq_get/put interface
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com >
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-30 23:48:35 -05:00
Evan Quan
b4af964e75
drm/amd/powerplay: make power limit retrieval as asic specific
...
The power limit retrieval should be done per asic. Since we may
need to lookup in the pptable and that's really asic specific.
Signed-off-by: Evan Quan <evan.quan@amd.com >
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-30 23:48:35 -05:00
Evan Quan
1f23cadbe0
drm/amd/powerplay: correct arcturus current clock level calculation
...
There may be 1Mhz delta between target and actual frequency. That
should be taken into consideration for current level check.
Signed-off-by: Evan Quan <evan.quan@amd.com >
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-30 23:48:35 -05:00
Evan Quan
60d435b73d
drm/amd/powerplay: support UMD PSTATE settings on arcturus
...
Enable arcturus UMD PSTATE support.
Signed-off-by: Evan Quan <evan.quan@amd.com >
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-30 23:48:35 -05:00
Evan Quan
4bf76e60b9
drm/amd/powerplay: fix arcturus real-time clock frequency retrieval
...
Make sure we can still get the accurate gfxclk/uclk/socclk frequency
even on dpm disabled.
Signed-off-by: Evan Quan <evan.quan@amd.com >
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-30 23:48:34 -05:00
Kevin Wang
790ef68afc
drm/amd/powerplay: remove redundancy debug log in smu
...
remove redundacy debug log in smu.
eg:
[ 6897.969447] amdgpu: [powerplay] smu 11 clk dpm feature 1 is not enabled
[ 6897.969448] amdgpu: [powerplay] smu 11 clk dpm feature 1 is not enabled
[ 6897.969448] amdgpu: [powerplay] smu 11 clk dpm feature 1 is not enabled
[ 6899.024114] amdgpu: [powerplay] Unsupported SMU message: 38
[ 6899.024151] amdgpu: [powerplay] smu 11 clk dpm feature 1 is not enabled
[ 6899.024151] amdgpu: [powerplay] smu 11 clk dpm feature 1 is not enabled
[ 6899.024152] amdgpu: [powerplay] smu 11 clk dpm feature 1 is not enabled
[ 6900.078296] amdgpu: [powerplay] Unsupported SMU message: 38
[ 6900.078332] amdgpu: [powerplay] smu 11 clk dpm feature 1 is not enabled
[ 6900.078332] amdgpu: [powerplay] smu 11 clk dpm feature 1 is not enabled
[ 6900.078333] amdgpu: [powerplay] smu 11 clk dpm feature 1 is not enabled
[ 6901.133230] amdgpu: [powerplay] Unsupported SMU message: 38
Signed-off-by: Kevin Wang <kevin1.wang@amd.com >
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-30 23:48:34 -05:00
Evan Quan
8a856ced35
drm/amd/powerplay: correct the bitmask used in arcturus
...
Those bitmask prefixed by "SMU_" should be used.
Signed-off-by: Evan Quan <evan.quan@amd.com >
Reviewed-by: Kevin Wang <kevin1.wang@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-30 23:48:34 -05:00
Evan Quan
55bf7e6243
drm/amd/powerplay: add missing arcturus feature maps
...
Add missing feature maps for arcturus.
Signed-off-by: Evan Quan <evan.quan@amd.com >
Reviewed-by: Kevin Wang <kevin1.wang@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-30 23:48:34 -05:00
Evan Quan
d427cf8f7f
drm/amd/powerplay: support fan speed retrieval on arcturus
...
Support arcturus fan speed retrieval.
Signed-off-by: Evan Quan <evan.quan@amd.com >
Reviewed-by: Kevin Wang <kevin1.wang@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-30 23:48:34 -05:00
Evan Quan
631807f091
drm/amd/powerplay: support real-time clock retrieval on arcturus
...
Enable arcturus real-time clock retrieval.
Signed-off-by: Evan Quan <evan.quan@amd.com >
Reviewed-by: Kevin Wang <kevin1.wang@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-30 23:48:34 -05:00
Evan Quan
ba74c8bf88
drm/amd/powerplay: support sensor reading on arcturus
...
Support sensor reading for gpu loading, power and
temperatures.
Signed-off-by: Evan Quan <evan.quan@amd.com >
Reviewed-by: Kevin Wang <kevin1.wang@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-30 23:48:34 -05:00
Evan Quan
832a7062a0
drm/amd/powerplay: init arcturus SMU metrics table on bootup
...
Initialize arcturus SMU metrics table.
Signed-off-by: Evan Quan <evan.quan@amd.com >
Reviewed-by: Kevin Wang <kevin1.wang@amd.com >
Reviewed-by: Alex Deucher <alexander.deucher@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-30 23:48:34 -05:00
Evan Quan
1f96ecef6f
drm/amd/powerplay: correct UVD/VCE/VCN power status retrieval
...
VCN should be used for Vega20 later ASICs while UVD and VCE
are for previous ASICs.
Signed-off-by: Evan Quan <evan.quan@amd.com >
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-30 23:48:34 -05:00
Evan Quan
5fa790f6c9
drm/amd/powerplay: correct Navi10 VCN powergate control (v2)
...
No VCN DPM bit check as that's different from VCN PG. Also
no extra check for possible double enablement/disablement
as that's already done by VCN.
v2: check return value of smu_feature_set_enabled
Signed-off-by: Evan Quan <evan.quan@amd.com >
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-30 23:48:34 -05:00
Evan Quan
bf2bf52383
drm/amd/powerplay: support VCN powergate status retrieval for SW SMU
...
Commonly used for VCN powergate status retrieval for SW SMU.
Signed-off-by: Evan Quan <evan.quan@amd.com >
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-30 23:48:34 -05:00
Evan Quan
ab9e314886
drm/amd/powerplay: support VCN powergate status retrieval on Raven
...
Enable VCN powergate status report on Raven.
Signed-off-by: Evan Quan <evan.quan@amd.com >
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-30 23:48:34 -05:00
Evan Quan
9829e3d89b
drm/amd/powerplay: add new sensor type for VCN powergate status
...
VCN is widely used in new ASICs and different from tranditional
UVD and VCE.
Signed-off-by: Evan Quan <evan.quan@amd.com >
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com >
Signed-off-by: Alex Deucher <alexander.deucher@amd.com >
2019-07-30 23:48:34 -05:00