Commit Graph

72934 Commits

Author SHA1 Message Date
Alex Deucher
e3e984ee43 drm/amdgpu: fix S0ix handling when the CONFIG_AMD_PMC=m
Need to check the module variant as well.

Acked-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:25:27 -04:00
Christian König
8b1c715fc8 drm/radeon: keep __user during cast
Silence static checker warning.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:25:24 -04:00
Christian König
12bfc0156e drm/radeon: fix AGP dependency
When AGP is compiled as module radeon must be compiled as module as
well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:25:21 -04:00
Christian König
b503138e49 drm/radeon: also init GEM funcs in radeon_gem_prime_import_sg_table
Otherwise we will run into a NULL ptr deref.

Signed-off-by: Christian König <christian.koenig@amd.com>
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=212137
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:25:15 -04:00
Evan Quan
0b4e90632d drm/amd/pm: correct the watermark settings for Polaris
The "/ 10" should be applied to the right-hand operand instead of
the left-hand one.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Noticed-by: Georgios Toptsidis <gtoptsid@gmail.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:25:11 -04:00
shaoyunl
c8941550aa drm/amdgpu : Fix asic reset regression issue introduce by 8f211fe8ac
This recent change introduce SDMA interrupt info printing with irq->process function.
These functions do not require a set function to enable/disable the irq

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:11:03 -04:00
Kenneth Feng
be6523e3a9 drm/amd/pm: bug fix for pcie dpm
Currently the pcie dpm has two problems.
1. Only the high dpm level speed/width can be overrided
if the requested values are out of the pcie capability.
2. The high dpm level is always overrided though sometimes
it's not necesarry.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:11:01 -04:00
Jonathan Kim
3f1d1eb2a2 drm/amdgpu: add ih waiter on process until checkpoint
Add IH function to allow caller to wait until ring entries are processed
until the checkpoint write pointer.

This will be primarily used by HMM to drain pending page fault interrupts
before memory unmap to prevent HMM from handling stale interrupts.

Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:10:57 -04:00
Zhan Liu
0dd7953234 drm/amdgpu/display: Implement functions to let DC allocate GPU memory
[Why]
DC needs to communicate with PM FW through GPU memory. In order
to do so we need to be able to allocate memory from within DC.

[How]
Call amdgpu_bo_create_kernel to allocate GPU memory and use a
list in amdgpu_display_manager to track our allocations so we
can clean them up later.

Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Zhan Liu <zhan.liu@amd.com>
Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Acked-by: Zhan Liu <zhan.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:10:49 -04:00
Zhan Liu
89551f2387 drm/amdgpu/display: Use wm_table.entries for dcn301 calculate_wm
[Why]
For DGPU Navi, the wm_table.nv_entries are used. These entires are not
populated for DCN301 Vangogh APU, but instead wm_table.entries are.

[How]
Use DCN21 Renoir style wm calculations.

Signed-off-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Zhan Liu <zhan.liu@amd.com>
Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: Zhan Liu <zhan.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:10:44 -04:00
Nirmoy Das
e5e6666db5 drm/amdgpu: fb BO should be ttm_bo_type_device
FB BO should not be ttm_bo_type_kernel type and
amdgpufb_create_pinned_object() pins the FB BO anyway.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:10:42 -04:00
shaoyunl
e3c1b0712f drm/amdgpu: Reset the devices in the XGMI hive duirng probe
In passthrough configuration, hypervisior will trigger the SBR(Secondary bus reset) to the devices
without sync to each other. This could cause device hang since for XGMI configuration, all the devices
within the hive need to be reset at a limit time slot. This serial of patches try to solve this issue
by co-operate with new SMU which will only do minimum house keeping to response the SBR request but don't
do the real reset job and leave it to driver. Driver need to do the whole sw init and minimum HW init
to bring up the SMU and trigger the reset(possibly BACO) on all the ASICs at the same time

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Acked-by: Andrey Grodzovsky andrey.grodzovsky@amd.com
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:10:38 -04:00
shaoyunl
655ce9cb13 drm/amdgpu: Add reset_list for device list used for reset
The gmc.xgmi.head list originally is designed for device list in the XGMI hive. Mix use it
for reset purpose will prevent the reset function to adjust XGMI device list which is required
in next change

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Andrey Grodzovsky andrey.grodzovsky@amd.com
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:10:35 -04:00
shaoyunl
a330b52a9e drm/amdgpu: Init the cp MQD if it's not be initialized before
The MQD might not be initialized duirng first init period if the device need to be reset
druing probe. Driver need to proper init them in gpu recovery period

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:10:33 -04:00
shaoyunl
8e2712e71b drm/amdgpu: Add kfd init_complete flag to check from amdgpu side
amdgpu driver may be in reset state during init which will not initialize the kfd,
driver need to initialize the KFD after reset by check the flag

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:10:28 -04:00
Kevin Wang
03597b47d6 Revert "drm/amdgpu: add psp RAP L0 check support"
This reverts commit d86fd724e5.

Disable PSP RAP L0 self test until to RAP feature ready.

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:03:45 -04:00
Mark Yacoub
f258907fdd drm/amdgpu: Verify bo size can fit framebuffer size on init.
To initialize the framebuffer, call drm_gem_fb_init_with_funcs which
verifies that the BO size can fit the FB size by calculating the minimum
expected size of each plane.

The bug was caught using igt-gpu-tools test: kms_addfb_basic.too-high
and kms_addfb_basic.bo-too-small

Tested on ChromeOS Zork by turning on the display and running a YT
video.

=== Changes from v1 ===
1. Added new line under declarations.
2. Use C style comment.

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Mark Yacoub <markyacoub@chromium.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:03:36 -04:00
Zhang Yunkai
c153401672 drm/amd/display: remove duplicate include in dcn21 and gpio
'dce110_resource.h' included in 'dcn21_resource.c' is duplicated.
'hw_gpio.h' included in 'hw_factory_dce110.c' is duplicated.

Signed-off-by: Zhang Yunkai <zhang.yunkai@zte.com.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:03:32 -04:00
Zhang Yunkai
51713e4e54 drm/amd/display: remove duplicate include in amdgpu_dm.c
'drm/drm_hdcp.h' included in 'amdgpu_dm.c' is duplicated.
It is also included in the 79th line.

Signed-off-by: Zhang Yunkai <zhang.yunkai@zte.com.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:03:24 -04:00
Jia-Ju Bai
692bd2a02e drm/amdgpu/swsmu: fix error return code of smu_v11_0_set_allowed_mask()
When bitmap_empty() or feature->feature_num triggers an error,
no error return code of smu_v11_0_set_allowed_mask() is assigned.
To fix this bug, ret is assigned with -EINVAL as error return code.

Reviewed-by: Evan Quan <evan.quan@amd.com>
Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:03:18 -04:00
Joshua Aberback
554ba183b1 drm/amd/display: Align cursor cache address to 2KB
[Why]
The registers for the address of the cursor are aligned to 2KB, so all
cursor surfaces also need to be aligned to 2KB. Currently, the
provided cursor cache surface is not aligned, so we need a workaround
until alignment is enforced by the surface provider.

[How]
 - round up surface address to nearest multiple of 2048
 - current policy is to provide a much bigger cache size than
   necessary,so this operation is safe

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Joshua Aberback <joshua.aberback@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:03:12 -04:00
Sung Lee
c54a6fe437 drm/amd/display: Revert dram_clock_change_latency for DCN2.1
[WHY & HOW]
Using values provided by DF for latency may cause hangs in
multi display configurations. Revert change to previous value.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Sung Lee <sung.lee@amd.com>
Reviewed-by: Haonan Wang <Haonan.Wang2@amd.com>
Acked-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:03:03 -04:00
Aric Cyr
04841b934c drm/amd/display: 3.2.126
DC version 3.2.126 brings improvements in multiple areas.
In summary, we highlight:

- DMUB fixes
- Firmware relase 0.0.55
- Expanded dmub_cmd documentation
- Enhancements in DCN30

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Acked-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:03:00 -04:00
Jake Wang
06ddcee49a drm/amd/display: Added multi instance support for panel control
[Why]
Panel control always programs instance 0. With multi eDP we need to
support multiple instances.

[How]
Use link index to set different instances for panel control.
Refactored LVTMA control to support multiple instances.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Jake Wang <haonan.wang2@amd.com>
Reviewed-by: Aric Cyr <Aric.Cyr@amd.com>
Acked-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:02:53 -04:00
Anthony Koo
1a595f28ea drm/amd/display: [FW Promotion] Release 0.0.55
Add comments to better describe the function of different cmds
and parameters in the dmub interface

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:02:49 -04:00
Yongqiang Sun
6804287bd1 drm/amd/display: Fixed read/write pointer issue for get dmub trace
[Why]
Driver get wrap around dmub trace data due to read pointer being
increased incorrectly when there are multiple interrupt
queues with very short interval

[How]
Check read/write pointer before copying data from ring buffer

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Yongqiang Sun <yongqiang.sun@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Acked-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:02:46 -04:00
Qingqing Zhuo
61a74712c8 drm/amd/display: Fix warning
[Why]
- Wrong scope for ifdef
- Missing struct description

[How]
Move ifdef and add comment

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:02:43 -04:00
Yongqiang Sun
3c934f454d drm/amd/display: Read all the trace entry if it is not empty
[Why]
If interval of two interrupt from dmub outbox0 is too short,
some event might be skipped

[How]
Compare read pointer and write pointer until all the event
entry is processed

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Yongqiang Sun <yongqiang.sun@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Acked-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:02:38 -04:00
Qingqing Zhuo
0c66824be8 drm/amd/display: Enable pflip interrupt upon pipe enable
[Why]
pflip interrupt would not be enabled promptly if a pipe is disabled
and re-enabled, causing flip_done timeout error during DP
compliance tests

[How]
Enable pflip interrupt upon pipe enablement

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:02:33 -04:00
Yongqiang Sun
d829303c5b drm/amd/display: Fix dmub trace event not update issue
[Why & How]
Reference to read pointer which is incorrect.
Change to reference to write pointer.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Yongqiang Sun <yongqiang.sun@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Acked-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:02:30 -04:00
Yongqiang Sun
6b66208f0c drm/amd/display: Move define from internal header to dmub_cmd.h
[Why & How]
Fix linux compile error

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Yongqiang Sun <yongqiang.sun@amd.com>
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:02:27 -04:00
Martin Leung
b12f60ac49 drm/amd/display: Fix typo when retrieving dppclk from UEFI config
[why]
In some boot configurations we need to retrieve the currently
UEFI-set dppclk, but there was a typo in the calculation

[how]
Fix typo to make dpp_clk calculate off dpp_clk divider instead of
disp_clk

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Martin Leung <martin.leung@amd.com>
Reviewed-by: Sung Lee <Sung.Lee@amd.com>
Acked-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:02:23 -04:00
Martin Leung
d3cf9fa6ba drm/amd/display: Skip powerstate DC hw access if virtual dal
[Why]
On baco-enabled systems running virtual dal, can get set power
state when hw is not initialized

[How]
Skip DC hw part of setPowerState when hw not available

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Martin Leung <martin.leung@amd.com>
Reviewed-by: Aric Cyr <Aric.Cyr@amd.com>
Acked-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:02:11 -04:00
Dillon Varone
ffe5650af0 drm/amd/display: Enabled pipe harvesting in dcn30
[Why & How]
Ported logic from dcn21 for reading in pipe fusing to dcn30.
Supported configurations are 1 and 6 pipes. Invalid fusing
will revert to 1 pipe being enabled.

Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Dillon Varone <dillon.varone@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Eryk Brol <eryk.brol@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:02:07 -04:00
Lijo Lazar
f78313fae9 drm/amdgpu: Check if FB BAR is enabled for ROM read
Some configurations don't have FB BAR enabled. Avoid reading ROM image
from FB BAR region in such cases.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:02:05 -04:00
Lijo Lazar
a364782f49 drm/amd/pm: Remove min/max overload of pp_dpm_sclk
To maintain consistency with legacy usage, remove min/max clock overload
of pp_dpm_sclk node.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:01:59 -04:00
Lijo Lazar
775f11aa17 drm/amd/pm: Enable pp_od_clk_voltage node on aldebaran
Use pp_od_clk_voltage node to enable performance determinism and GFX
clock min/max range for aldebaran. This is to avoid overload of
pp_dpm_sclk and maintain consistency in user lib interfaces.

Ex: To enable perf determinism at 900MHz max gfx clock

1) echo perf_determinism > /sys/bus/pci/devices/.../power_dpm_force_performance_level
2) echo s 1 900 > /sys/bus/pci/devices/.../pp_od_clk_voltage
3) echo c > /sys/bus/pci/devices/.../pp_od_clk_voltage

Ex: To enable min 500MHz/max 900MHz gfx clocks

1) echo manual > "/sys/bus/pci/devices/.../power_dpm_force_performance_level"
2) echo s 0 500 > "/sys/bus/pci/devices/.../pp_od_clk_voltage"
3) echo s 1 900 > "/sys/bus/pci/devices/.../pp_od_clk_voltage”
4) echo c > "/sys/bus/pci/devices/.../pp_od_clk_voltage”

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:01:51 -04:00
Shashank Sharma
e36ccf9a96 drm/amdgpu: Set GTT_USWC flag to enable freesync v2
This patch sets 'AMDGPU_GEM_CREATE_CPU_GTT_USWC' as input
parameter flag, during object creation of an imported DMA
buffer.

In absence of this flag:
1. Function amdgpu_display_supported_domains() doesn't add
   AMDGPU_GEM_DOMAIN_GTT as supported domain.
2. Due to which, Function amdgpu_display_user_framebuffer_create()
   refuses to create framebuffer for imported DMA buffers.
3. Due to which, AddFB() IOCTL fails.
4. Due to which, amdgpu_present_check_flip() check fails in DDX
5. Due to which DDX driver doesn't allow flips (goes to blitting)
6. Due to which setting Freesync/VRR property fails for PRIME buffers.

So, this patch finally enables Freesync with PRIME buffer offloading.

v2 (chk): instead of just checking the flag we copy it over if the
          exporter is an amdgpu device as well.

Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:01:42 -04:00
Shashank Sharma
0b46bc3a9d drm/amdgpu: clean-up unused variable
Variable 'bp' seems to be unused residue from previous
logic, and is not required anymore.

Cc: Koenig Christian <christian.koenig@amd.com>
Cc: Deucher Alexander <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:01:28 -04:00
Aurabindo Pillai
c0ea73a4ad Revert freesync video patches temporarily
This temporarily reverts freesync video patches since it causes regression with
eDP displays. This patch is a squashed revert of the following patches:

6f59f229f8 ("drm/amd/display: Skip modeset for front porch change")
d10cd527f5 ("drm/amd/display: Add freesync video modes based on preferred modes")
0eb1af2e82 ("drm/amd/display: Add module parameter for freesync video mode")

Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Anson Jacob <anson.jacob@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:01:04 -04:00
Anson Jacob
50e2fc36e7 drm/amdkfd: Fix UBSAN shift-out-of-bounds warning
If get_num_sdma_queues or get_num_xgmi_sdma_queues is 0, we end up
doing a shift operation where the number of bits shifted equals
number of bits in the operand. This behaviour is undefined.

Set num_sdma_queues or num_xgmi_sdma_queues to ULLONG_MAX, if the
count is >= number of bits in the operand.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1472

Reported-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Anson Jacob <Anson.Jacob@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:00:57 -04:00
Oak Zeng
47bfa5f60f drm/amdgpu: Increase PSP runtime TMR region size
Aldebaran uses more than 4M runtime TMR. The current
hard coded 4M TMR is not big enough for Aldebaran.
Increase it to 8M.

v2: Only do 8M size for ALDEBARAN (Hawking)

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:00:51 -04:00
Eric Huang
c3c9e0faf4 drm/amdkfd: apply uncached flag for aldebaran
The flag is only applied on fine-grained memory.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:00:48 -04:00
Eric Huang
2e2f197f4c drm/amdgpu: set snoop bit in pde/pte entries for A+A
Page tables in vram mapping to cpu is changed from uncached to
cached in A+A, the snoop bit in VM_CONTEXTx_PAGE_TABLE_BASE_ADDR/
PDE0s/PDE1s/PDE2s/PTE.TFs has to be set so gpuvm walker snoop
page table data out of CPU cache.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:00:46 -04:00
Eric Huang
06bfc045d5 drm/amdgpu: set CPU mapping of vram as cached for A+A mode
New A+A HW supports cached vram mapped to cpu.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:00:44 -04:00
Dennis Li
761d86d37f drm/amdgpu: harvest edc status when connected to host via xGMI
When connected to a host via xGMI, system fatal errors may trigger
warm reset, driver has no change to query edc status before reset.
Therefore in this case, driver should harvest previous error loging
registers during boot, instead of only resetting them.

v2:
1. IP's ras_manager object is created when its ras feature is enabled,
so change to query edc status after amdgpu_ras_late_init called

2. change to enable watchdog timer after finishing gfx edc init

Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reivewed-by: Hawking Zhang <hawking.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:00:41 -04:00
Felix Kuehling
63dbb0db3a drm/amdgpu: Make noretry the default on Aldebaran
This is needed for best machine learning performance. XNACK can still
be enabled per-process if needed.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Tested-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:00:38 -04:00
Harish Kasiviswanathan
4464820dc7 drm/amdgpu: update default timeout of Aldebaran SQ watchdog
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Reivewed-by: Hawking Zhang <hawking.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:00:35 -04:00
Kenneth Feng
bea9cd3f8d drm/amd/pm: add new data in metrics table
Export new data in the metrics table for gfx and memory
utilization counter, and each hbm temperature as well.

v2:
change the metrics table version to v1.1

v3:
fix the coding style
v4:
rebase against latest kernel

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:00:28 -04:00
Kevin Wang
d86fd724e5 drm/amdgpu: add psp RAP L0 check support
add PSP RAP L0 check when RAP TA is loaded.

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-03-23 23:00:25 -04:00