Commit Graph

81174 Commits

Author SHA1 Message Date
Kent Russell
5eb877b282 drm/amdkfd: Fix ASIC name typos
Three misspelled ASICs in comments here, so fix the spelling

Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:28 -05:00
Felix Kuehling
6f4cb84ae0 drm/amdkfd: Fix DQM asserts on Hawaii
start_nocpsch would never set dqm->sched_running on Hawaii due to an
early return statement. This would trigger asserts in other functions
and end up in inconsistent states.

Bug: https://github.com/RadeonOpenCompute/ROCm/issues/1624
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:28 -05:00
Harry Wentland
dc5d4aff2e drm/amdgpu: Use correct VIEWPORT_DIMENSION for DCN2
For some reason this file isn't using the appropriate register
headers for DCN headers, which means that on DCN2 we're getting
the VIEWPORT_DIMENSION offset wrong.

This means that we're not correctly carving out the framebuffer
memory correctly for a framebuffer allocated by EFI and
therefore see corruption when loading amdgpu before the display
driver takes over control of the framebuffer scanout.

Fix this by checking the DCE_HWIP and picking the correct offset
accordingly.

Long-term we should expose this info from DC as GMC shouldn't
need to know about DCN registers.

Cc: stable@vger.kernel.org
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:28 -05:00
Tao Zhou
15084a8e16 drm/amd/pm: only send GmiPwrDnControl msg on master die (v3)
PMFW only returns 0 on master die and sends NACK back on other dies for
the message.

v2: only send GmiPwrDnControl msg on master die instead of all
dies.
v3: remove the pointer check for get_socket_id and get_die_id as they
should be present on Aldebaran.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:28 -05:00
Guchun Chen
2096b74b1d drm/amdgpu: use spin_lock_irqsave to avoid deadlock by local interrupt
This is observed in SRIOV case with virtual KMS as display.

_raw_spin_lock_irqsave+0x37/0x40
drm_handle_vblank+0x69/0x350 [drm]
? try_to_wake_up+0x432/0x5c0
? amdgpu_vkms_prepare_fb+0x1c0/0x1c0 [amdgpu]
drm_crtc_handle_vblank+0x17/0x20 [drm]
amdgpu_vkms_vblank_simulate+0x4d/0x80 [amdgpu]
__hrtimer_run_queues+0xfb/0x230
hrtimer_interrupt+0x109/0x220
__sysvec_apic_timer_interrupt+0x64/0xe0
asm_call_irq_on_stack+0x12/0x20

Fixes: 84ec374bd5 ("drm/amdgpu: create amdgpu_vkms (v4)")
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Kelly Zytaruk <kelly.zytaruk@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:28 -05:00
Prike Liang
4eaf21b752 drm/amdgpu: not return error on the init_apu_flags
In some APU project we needn't always assign flags to identify each other,
so we may not need return an error.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:27 -05:00
Felix Kuehling
b121862c78 drm/amdkfd: Use prange->update_list head for remove_list
The remove_list head was only used for keeping track of existing ranges
that are to be removed from the svms->list. The update_list was used for
new or existing ranges that need updated attributes. These two cases are
mutually exclusive (i.e. the same range will never be on both lists).
Therefore we can use the update_list head to track the remove_list and
save another 16 bytes in the svm_range struct.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:27 -05:00
Felix Kuehling
ef3b4137aa drm/amdkfd: Use prange->list head for insert_list
There are seven list_heads in struct svm_range: list, update_list,
remove_list, insert_list, svm_bo_list, deferred_list, child_list. This
patch and the next one remove two of them that are redundant.

The insert_list head was only used for new ranges that are not on the
svms->list yet. So we can use that list head for keeping track of
new ranges before they get added, and use list_move_tail to move them
to the svms->list when ready.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:27 -05:00
Jiasheng Jiang
abfaf0eee9 drm/amdkfd: Check for null pointer after calling kmemdup
As the possible failure of the allocation, kmemdup() may return NULL
pointer.
Therefore, it should be better to check the 'props2' in order to prevent
the dereference of NULL pointer.

Fixes: 3a87177eb1 ("drm/amdkfd: Add topology support for dGPUs")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:27 -05:00
José Expósito
978ffac878 drm/amd/display: invalid parameter check in dmub_hpd_callback
The function performs a check on the "adev" input parameter, however, it
is used before the check.

Initialize the "dev" variable after the sanity check to avoid a possible
NULL pointer dereference.

Fixes: e27c41d5b0 ("drm/amd/display: Support for DMUB HPD interrupt handling")
Addresses-Coverity-ID: 1493909 ("Null pointer dereference")
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: José Expósito <jose.exposito89@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:27 -05:00
Rajneesh Bhardwaj
8b5da5a458 Revert "drm/amdgpu: Don't inherit GEM object VMAs in child process"
This reverts commit fbcdbfde87.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:27 -05:00
Mario Limonciello
83293f7f3d drm/amd/display: reset dcn31 SMU mailbox on failures
Otherwise future commands may fail as well leading to downstream
problems that look like they stemmed from a timeout the first time
but really didn't.

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:27 -05:00
Greg Kroah-Hartman
5fea167ec0 drm/amdkfd: use default_groups in kobj_type
There are currently 2 ways to create a set of sysfs files for a
kobj_type, through the default_attrs field, and the default_groups
field.  Move the amdkfd sysfs code to use default_groups field which has
been the preferred way since aa30f47cf6 ("kobject: Add support for
default attribute groups to kobj_type") so that we can soon get rid of
the obsolete default_attrs field.

Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:26 -05:00
Greg Kroah-Hartman
7ff61cdcc8 drm/amdgpu: use default_groups in kobj_type
There are currently 2 ways to create a set of sysfs files for a
kobj_type, through the default_attrs field, and the default_groups
field.  Move the amdgpu sysfs code to use default_groups field which has
been the preferred way since aa30f47cf6 ("kobject: Add support for
default attribute groups to kobj_type") so that we can soon get rid of
the obsolete default_attrs field.

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: John Clements <john.clements@amd.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Jonathan Kim <jonathan.kim@amd.com>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: shaoyunl <shaoyun.liu@amd.com>
Cc: Tao Zhou <tao.zhou1@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:26 -05:00
Tom St Denis
4cc9f86f85 drm/amd/amdgpu: Add pcie indirect support to amdgpu_mm_wreg_mmio_rlc()
The function amdgpu_mm_wreg_mmio_rlc() is used by debugfs to write to
MMIO registers.  It didn't support registers beyond the BAR mapped MMIO
space.  This adds pcie indirect write support.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:26 -05:00
Nirmoy Das
575e55ee4f drm/amdgpu: recover gart table at resume
Get rid off pin/unpin of gart BO at resume/suspend and
instead pin only once and try to recover gart content
at resume time. This is much more stable in case there
is OOM situation at 2nd call to amdgpu_device_evict_resources()
while evicting GART table.

v3: remove gart recovery from other places
v2: pin gart at amdgpu_gart_table_vram_alloc()

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:26 -05:00
Nirmoy Das
ec6aae9711 drm/amdgpu: do not pass ttm_resource_manager to vram_mgr
Do not allow exported amdgpu_vram_mgr_*() to accept
any ttm_resource_manager pointer. Also there is no need
to force other module to call a ttm function just to
eventually call vram_mgr functions.

v2: pass adev's vram_mgr instead of adev

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:26 -05:00
Nirmoy Das
ffb378fb30 drm/amdkfd: remove unused function
Remove unused amdgpu_amdkfd_get_vram_usage()

CC: Felix.Kuehling@amd.com

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fixes: dfcbe6d5f4 ("drm/amdgpu: Remove unused function pointers")
2022-01-11 15:44:26 -05:00
Nirmoy Das
1dd8b1b987 drm/amdgpu: do not pass ttm_resource_manager to gtt_mgr
Do not allow exported amdgpu_gtt_mgr_*() to accept
any ttm_resource_manager pointer. Also there is no need
to force other module to call a ttm function just to
eventually call gtt_mgr functions.

v4: remove unused adev.
v3: upcast mgr from ttm resopurce manager instead of
getting it from adev.
v2: pass adev's gtt_mgr instead of adev.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:26 -05:00
Leslie Shi
62d5f9f711 drm/amdgpu: Unmap MMIO mappings when device is not unplugged
Patch: 3efb17ae7e92 ("drm/amdgpu: Call amdgpu_device_unmap_mmio() if device
is unplugged to prevent crash in GPU initialization failure") makes call to
amdgpu_device_unmap_mmio() conditioned on device unplugged. This patch unmaps
MMIO mappings even when device is not unplugged.

v2: Add condition of drm_dev_enter() to deleted unmaps in patch
"drm/amdgpu: Unmap all MMIO mappings"

Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:26 -05:00
Peng Ju Zhou
6638391b9f drm/amdgpu: Enable second VCN for certain Navy Flounder.
Certain Navy Flounder cards have 2 VCNs, enable it.

Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:25 -05:00
Mario Limonciello
63ad5371cd drm/amd/display: explicitly set is_dsc_supported to false before use
When UBSAN is enabled a case is shown on unplugging the display that
this variable hasn't been initialized by `update_dsc_caps`, presumably
when the display was unplugged it wasn't copied from the DPCD.

Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1956497
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:25 -05:00
Jiawei Gu
b54ce6c92c drm/amdgpu: Clear garbage data in err_data before usage
Memory of err_data should be cleaned before usage
when there're multiple entry in ras ih.
Otherwise garbage data from last loop will be used.

Signed-off-by: Jiawei Gu <Jiawei.Gu@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:25 -05:00
Nicholas Kazlauskas
19e43f1276 drm/amd/display: Add version check before using DP alt query interface
[Why]
To maintain compatibility with firmware older than 4.0.11.

Those firmware may have interrmittent hangs with RDCSPIPE or the PHY,
but we shouldn't regress their previous behavior.

[How]
Use the new path if firmware is development or 4.0.11 or newer. Use the
legacy path otherwise.

Fixes: 41f91315b5 ("drm/amd/display: Query DMCUB for dp alt status")

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Hansen Dsouza <hansen.dsouza@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-07 17:22:01 -05:00
Mikita Lipski
214993e106 drm/amd/display: introduce mpo detection flags
[why]
We want to know if new crtc state is enabling MPO configuration before
enabling it.
[how]
Detect if both primary and overlay planes are enabled on the same CRTC.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-07 17:21:53 -05:00
Charlene Liu
46a74381e5 drm/amd/display: Add check for forced_clocks debug option
[why]
driver missed the check.

[how]
add the check.
add min display clock = 100mhz check based on dccg doc.

[note]
add SetPhyclkVoltageByFreq as confirmed with smu, but not enabled in
this change.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-07 17:21:45 -05:00
Nicholas Kazlauskas
79d6b9351f drm/amd/display: Don't reinitialize DMCUB on s0ix resume
[Why]
PSP will suspend and resume DMCUB. Driver should just wait for DMCUB to
finish the auto load before continuining instead of placing it into
reset, wiping its firmware state and reinitializing.

If we don't let DMCUB fully finish initializing for S0ix then some state
will be lost and screen corruption can occur due to incorrect address
translation.

[How]
Use dmub_srv callbacks to determine in DMCUB is running and wait for
auto-load to complete before continuining.

In S0ix DMCUB will be running and DAL fw so initialize will skip.

In S3 DMCUB will not be running and we will do a full hardware init.

In S3 DMCUB will be running but will not be DAL fw so we will also do
a full hardware init.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Mikita Lipski <Mikita.Lipski@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-07 17:21:29 -05:00
Wenjing Liu
580013b2ce drm/amd/display: unhard code link to phy idx mapping in dc link and clean up
[why]
1. Current code hard codes link to PHY mapping in dc link level per asic
per revision.
This is not scalable. In long term the mapping will be obatined from
DMUB and store in dc resource.

2. Depending on DCN revision and endpoint type, the definition of
dio_output_idx dio_output_type and phy_idx are not  consistent. We need
to unify the meaning of these hardware indices across different system
configuration.

[how]
1. Temporarly move the hardcoded mapping to dc_resource level, which
should have full awareness of asic specific configuration and add a TODO
comment to move the mapping to DMUB.

2. populate dio_output_idx/phy_idx for all configuration, define
usb4_enabled bit instead of dio_output_type as an external enum.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Eric Yang <Eric.Yang2@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-07 17:21:20 -05:00
Yi-Ling Chen
771ced73fc drm/amd/display: Fix underflow for fused display pipes case
[Why]
Depend on res_pool->res_cap->num_timing_generator to query timing
gernerator information, it would case underflow at the fused display
pipes case.
Due to the res_pool->res_cap->num_timing_generator records default
timing generator resource built in driver, not the current chip.

[How]
Some ASICs would be fused display pipes less than the default setting.
In dcnxx_resource_construct function, driver would obatin real timing
generator count and store it into res_pool->timing_generator_count.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Yi-Ling Chen <Yi-Ling.Chen2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-07 17:21:05 -05:00
Mario Limonciello
eac4c54bf7 drm/amdgpu: don't set s3 and s0ix at the same time
This makes it clearer which codepaths are in use specifically in
one state or the other.

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-07 17:20:55 -05:00
Mario Limonciello
e53d9665ab drm/amdgpu: explicitly check for s0ix when evicting resources
This codepath should be running in both s0ix and s3, but only does
currently because s3 and s0ix are both set in the s0ix case.

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-07 17:20:38 -05:00
Evan Quan
f38b0d48ca drm/amd/pm: keep the BACO feature enabled for suspend
To pair with the workaround which always reset the ASIC in suspend.
Otherwise, the reset which relies on BACO will fail.

Fixes: daf8de0874 ("drm/amdgpu: always reset the asic in suspend (v2)")

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-07 17:19:46 -05:00
James Yao
216a987319 drm/amdgpu: add dummy event6 for vega10
[why]
Malicious mailbox event1 fails driver loading on vega10.
A dummy event6 prevent driver from taking response from malicious event1 as its own.

[how]
On vega10, send a mailbox event6 before sending event1.

Signed-off-by: James Yao <yiqing.yao@amd.com>
Reviewed-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-07 17:19:34 -05:00
yipechai
5b0ce2d41b drm/amdkfd: enable sdma ecc interrupt event can be handled by event_interrupt_wq_v9
Enable sdma ecc interrupt event can be handled by event_interrupt_wq_v9.

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-07 17:19:14 -05:00
Dave Airlie
cb6846fbb8 Merge tag 'amd-drm-next-5.17-2021-12-30' of ssh://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.17-2021-12-30:

amdgpu:
- Suspend/resume fixes
- Fence fix
- Misc code cleanups
- IP discovery fixes
- SRIOV fixes
- RAS fixes
- GMC 8 VRAM detection fix
- FRU fixes for Aldebaran
- Display fixes

amdkfd:
- SVM fixes
- IP discovery fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211230141032.613596-1-alexander.deucher@amd.com
2021-12-31 10:59:17 +10:00
Alex Deucher
0637d41786 drm/amdgpu: no DC support for headless chips
Chips with no display hardware should return false for
DC support.

v2: drop Arcturus and Aldebaran

Fixes: f7f12b2582 ("drm/amdgpu: default to true in amdgpu_device_asic_has_dc_support")
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reported-by: Tareque Md.Hanif <tarequemd.hanif@yahoo.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-30 08:54:45 -05:00
José Expósito
f28cad86ad drm/amd/display: fix dereference before NULL check
The "plane_state" pointer was access before checking if it was NULL.

Avoid a possible NULL pointer dereference by accessing the plane
address after the check.

Addresses-Coverity-ID: 1493892 ("Dereference before null check")
Fixes: 3f68c01be9 ("drm/amd/display: add cyan_skillfish display support")
Signed-off-by: José Expósito <jose.exposito89@gmail.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-30 08:54:45 -05:00
Alex Deucher
6dc8265f98 drm/amdgpu: always reset the asic in suspend (v2)
If the platform suspend happens to fail and the power rail
is not turned off, the GPU will be in an unknown state on
resume, so reset the asic so that it will be in a known
good state on resume even if the platform suspend failed.

v2: handle s0ix

Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-30 08:54:45 -05:00
Evan Quan
4a700546ec drm/amdgpu: put SMU into proper state on runpm suspending for BOCO capable platform
By setting mp1_state as PP_MP1_STATE_UNLOAD, MP1 will do some proper cleanups and
put itself into a state ready for PNP. That can workaround some random resuming
failure observed on BOCO capable platforms.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-30 08:54:45 -05:00
Yizhuo Zhai
0726ed3065 drm/amd/display: Fix the uninitialized variable in enable_stream_features()
In function enable_stream_features(), the variable "old_downspread.raw"
could be uninitialized if core_link_read_dpcd() fails, however, it is
used in the later if statement, and further, core_link_write_dpcd()
may write random value, which is potentially unsafe.

Fixes: 6016cd9dba ("drm/amd/display: add helper for enabling mst stream features")
Cc: stable@vger.kernel.org
Signed-off-by: Yizhuo Zhai <yzhai003@ucr.edu>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-30 08:54:45 -05:00
Alex Deucher
937ed9c866 drm/amdgpu: fix runpm documentation
It's not only supported by HG/PX laptops.  It's supported
by all dGPUs which supports BOCO/BACO functionality (runtime
D3).

BOCO - Bus Off, Chip Off.  The entire chip is powered off.
       This is controlled by ACPI.
BACO - Bus Active, Chip Off.  The chip still shows up
       on the PCI bus, but the device itself is powered
       down.

v2: fix missed HG/PX reference

Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-30 08:54:45 -05:00
Marina Nikolic
11c9cc95f8 amdgpu/pm: Make sysfs pm attributes as read-only for VFs
== Description ==
Setting values of pm attributes through sysfs
should not be allowed in SRIOV mode.
These calls will not be processed by FW anyway,
but error handling on sysfs level should be improved.

== Changes ==
This patch prohibits performing of all set commands
in SRIOV mode on sysfs level.
It offers better error handling as calls that are
not allowed will not be propagated further.

== Test ==
Writing to any sysfs file in passthrough mode will succeed.
Writing to any sysfs file in ONEVF mode will yield error:
"calling process does not have sufficient permission to execute a command".

Signed-off-by: Marina Nikolic <Marina.Nikolic@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-30 08:54:45 -05:00
Tao Zhou
fec8c5244f drm/amdgpu: save error count in RAS poison handler
Otherwise the RAS error count couldn't be queried from sysfs.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-30 08:54:45 -05:00
Guchun Chen
45e3d1db7d drm/amdgpu: drop redundant semicolon
A minor typo.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-30 08:54:45 -05:00
Wenjing Liu
4c3adc0b84 drm/amd/display: get and restore link res map
[why]
When reboot the link res map should be persisted.  So during boot up,
driver will look at the map to determine which link should take priority
to use certain link res.  This is to ensure that link res remains
unshuffled after a reboot.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-30 08:54:45 -05:00
Wenjing Liu
6dd8931b1c drm/amd/display: support dynamic HPO DP link encoder allocation
[why]
When there are more DP2.0 RXs connected than the number HPO DP link
encoders we have, we need to dynamically allocate HPO DP link encoder to
the port that needs it.

[how]
Only allocate HPO DP link encoder when it is needed.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-30 08:54:45 -05:00
Wenjing Liu
3d38a5839e drm/amd/display: access hpo dp link encoder only through link resource
[why]
Update all accesses to use hpo dp link encoder through link resource
only.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-30 08:54:45 -05:00
Wenjing Liu
f3fac9481b drm/amd/display: populate link res in both detection and validation
[why]
This commit is to populate link res in preparation of the next commit.
The next commit will replace all existing code to use link res instead

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-30 08:54:44 -05:00
Wenjing Liu
ef30f441f6 drm/amd/display: define link res and make it accessible to all link interfaces
[why]
There will be a series of re-arch changes in Link Resource Management.
They are more and more muxable link resource objects and the resource is
insufficient for a one to one allocation to all links created.
Therefore a link resource sharing logic is required to determine which
link should use certain link resource.

This commit is the first one in this series that starts by defining a
link resource struct, this struct will be available to all interfaces
that need to perform link programming sequence.

In later commits, we will granduately decouple link resource objects out
of dc link. So instead of access a link resource from dc link. Current
link's resource can be accessible through pipe_ctx->link_res during
commit, or by calling  dc_link_get_cur_link_res function with current
link passed in after commit.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-30 08:54:44 -05:00
Aric Cyr
19afe66ddb drm/amd/display: 3.2.167
This version brings along the following:

- Fixes and improvements in the LTTPR code
- Improve z-state
- Fix null pointer check
- Improve communication with s0i2
- Update multiple-display split policy
- Add missing registers

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-30 08:54:44 -05:00