linux

Author	SHA1	Message	Date
Haohui Mai	428f273cbb	drm/amdgpu: Fix out-of-bound access for gfx_v10_0_ring_test_ib() The gfx_v10_0_ring_test_ib() function uses 20 bytes instead of 16 bytes during the test. The patch sets the size of the allocation to be 4-byte larger to match the actual usage. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Haohui Mai <ricetons@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-26 11:44:09 -04:00
Haohui Mai	ca5d251b3b	drm/amdgpu/sdma: Remove redundant lower_32_bits() calls when settings SDMA doorbell Updated the patch for the pre-vega hardware. I kept the clamping code to be safe. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Haohui Mai <ricetons@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-26 11:44:06 -04:00
Haohui Mai	7dba6e838e	drm/amdgpu/sdma: Fix incorrect calculations of the wptr of the doorbells This patch fixes the issue where the driver miscomputes the 64-bit values of the wptr of the SDMA doorbell when initializing the hardware. SDMA engines v4 and later on have full 64-bit registers for wptr thus they should be set properly. Older generation hardwares like CIK / SI have only 16 / 20 / 24bits for the WPTR, where the calls of lower_32_bits() will be removed in a following patch. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Haohui Mai <ricetons@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-26 11:44:01 -04:00
Tom Rix	9714d357e2	drm/radeon: change cac_weights_* to static Sparse reports these issues si_dpm.c:332:26: warning: symbol 'cac_weights_pitcairn' was not declared. Should it be static? si_dpm.c:1088:26: warning: symbol 'cac_weights_oland' was not declared. Should it be static? Both of these variables are only used in si_dpm.c. Single file variables should be static, so change their storage-class specifiers to static. Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-26 11:43:57 -04:00
Tom Rix	790d8e8ecb	drm/radeon: change cik_default_state table from global to static Sparse reports these issues cik_blit_shaders.c:31:11: warning: symbol 'cik_default_state' was not declared. Should it be static? cik_blit_shaders.c:246:11: warning: symbol 'cik_default_size' was not declared. Should it be static? cik_default_state and cik_default_size are only used in cik.c. Single file symbols should be static. So move their definitions to cik_blit_shaders.h and change their storage-class-specifier to static. Remove unneeded cik_blit_shader.c Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-26 11:43:51 -04:00
Randy Dunlap	4ae182de39	drm/amd/display: fix non-kernel-doc comment warnings Fix kernel-doc warnings for a comment that should not use kernel-doc notation: dmub_psr.c:235: warning: This comment starts with '/*', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst Set PSR power optimization flags. dmub_psr.c:235: warning: missing initial short description on line: * Set PSR power optimization flags. Fixes: `e5dfcd2727` ("drm/amd/display: dc_link_set_psr_allow_active refactoring") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Cc: Robin Chen <po-tchen@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Anthony Koo <Anthony.Koo@amd.com> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Leo Li <sunpeng.li@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-26 11:43:08 -04:00
Philip Yang	601354f344	drm/amdkfd: Update mapping if range attributes changed Change SVM range mapping flags or access attributes don't trigger migration, if range is already mapped on GPUs we should update GPU mapping and pass flush_tlb flag true to amdgpu vm. Change SVM range preferred_loc or migration granularity don't need update GPU mapping, skip the validate_and_map. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-26 11:42:59 -04:00
Philip Yang	6b9c63a6eb	drm/amdkfd: Add SVM range mapped_to_gpu flag To avoid unnecessary unmap SVM range from GPUs if range is not mapped on GPUs when migrating the range. This flag will also be used to flush TLB when updating the existing mapping on GPUs. It is protected by prange->migrate_mutex and mmap read lock in MMU notifier callback. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-26 11:42:44 -04:00
Aric Cyr	398bb28389	drm/amd/display: 3.2.183 This version brings along following fixes: - Keep tracking of DSC packed PPS for future use - Maintain current link settings in link loss interrupt - Remove DDC write and read size check - Read PSR-SU cap DPCD for specific panel - Don't pass HostVM by default on DCN3.1 - Reset cached PSR parameters after hibernate - Add audio readback registers - Update dcn315 clk table read Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Aric Cyr <aric.cyr@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-25 17:12:04 -04:00
Ilya Bakoulin	9844792ec8	drm/amd/display: Keep track of DSC packed PPS [Why] Store current packed PPS data in dc_stream_state for future use. Reviewed-by: Aric Cyr <Aric.Cyr@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Ilya Bakoulin <Ilya.Bakoulin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-25 17:11:54 -04:00
Dillon Varone	3c54074504	drm/amd/display: Remove unused integer Integer no longer needed. Reviewed-by: Martin Leung <Martin.Leung@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Dillon Varone <dillon.varone@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-25 17:11:35 -04:00
Gary Li	9fbfeaf110	drm/amd/display: Maintain current link settings in link loss interrupt [Why] DP compliance test case 400.3.2.3 is failed because in link loss interrupt the current link settings is not used in the DP link training. [How] In link loss interrupt, use the current link settings in the following DP link training. Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Gary Li <garyli12@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-25 17:11:27 -04:00
Leo Ma	e953cd08d7	drm/amd/display: Remove ddc write and read size checking [Why] Customer found I2C over AUX using ADL_Display_DDCBlockAccess_Get will fail when sending more than 256 bytes of data; [How] Remove the write and read size checking to allow sending data more than 256 bytes; Reviewed-by: Martin Leung <Martin.Leung@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Leo Ma <hanghong.ma@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-25 17:11:13 -04:00
David Zhang	d9f442e9a2	drm/amd/display: read PSR-SU cap DPCD for specific panel [why & how] For some specific eDP panel, we'd check the PSR-SU cap during boot by reading the vendor specific DPCD, otherwise it will cause to false report the eDP panel which supports PSR-SU as an non-PSR-SU panel. - add the vendor specific DPCD address in ddc_service_types header - if specific eDP panel detected, check vendor specific DPCD for PSR-SU cap Reviewed-by: Aurabindo Jayamohanan Pillai <Aurabindo.Pillai@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: David Zhang <dingchen.zhang@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-25 17:10:43 -04:00
Michael Strauss	4a0caac06a	drm/amd/display: Don't pass HostVM by default on DCN3.1 [WHY] Roll back previous change to stop passing this value by default, instead add a debug flag to override to previous behaviour (or force HostVM calcs) Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Michael Strauss <michael.strauss@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-25 17:10:36 -04:00
Evgenii Krasnikov	d2069326d2	drm/amd/display: Reset cached PSR parameters after hibernate [WHY] After hibernate system might be using old invalid psr_power_opt and psr_allow_active that never get reset [HOW] Reset cached Panel Self Refresh parameters when PSR is first configured for eDP in dc_link_setup_psr. Reviewed-by: Harry Vanzylldejong <harry.vanzylldejong@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Evgenii Krasnikov <Evgenii.Krasnikov@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-25 17:10:29 -04:00
Ilya Bakoulin	e955b54732	drm/amd/display: Add Audio readback registers [Why] Can be useful for verifying the correctness of audio output. Reviewed-by: Aric Cyr <Aric.Cyr@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Ilya Bakoulin <Ilya.Bakoulin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-25 17:10:17 -04:00
Dmytro Laktyushkin	89c342a966	drm/amd/display: update dcn315 clk table read Clean up the sequence by making sure clk_mgr always builds a reasonable clock table regardless of what we read from smu by moving all defaults from resource soc struct to clk_mgr. Now the only thing resource soc update does is read the clock table and apply any DC specific policy decisions to how clocks are populated in dml soc. Reviewed-by: Qingqing Zhuo <Qingqing.Zhuo@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-25 17:06:16 -04:00
Aric Cyr	259f249c4b	drm/amd/display: 3.2.182 This version brings along following improvements: - Fix HDCP QUERY Error for eDP and Tiled - Insert smu busy status before sending another request Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Aric Cyr <aric.cyr@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-25 17:06:10 -04:00
Mustapha Ghaddar	84ebd73e32	drm/amd/display: Fix HDCP QUERY Error for eDP and Tiled [WHY] For dio_output_encoder ID we are relying on SW concept which is invisible to HW [HOW] Needed to create separate cases for when DPIA and non DPIA for dio link encoder ID Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com> Reviewed-by: James Zhang <james.zhang@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Mustapha Ghaddar <mghaddar@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-25 17:06:04 -04:00
Oliver Logush	721af39f00	drm/amd/display: Insert smu busy status before sending another request [why] Need to check if result register is busy before sending another request [how] Call method to check if result register is busy Reviewed-by: Charlene Liu <Charlene.Liu@amd.com> Acked-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Oliver Logush <oliver.logush@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-25 17:05:58 -04:00
Felix Kuehling	c3eb12dff0	drm/amdkfd: Ignore bogus signals from MEC efficiently MEC firmware sometimes sends signal interrupts without a valid context ID on end of pipe events that don't intend to signal any HSA signals. This triggers the slow path in kfd_signal_event_interrupt that scans the entire event page for signaled events. Detect these signals in the top half interrupt handler to stop processing them as early as possible. Because we now always treat event ID 0 as invalid, reserve that ID during process initialization. v2: Update firmware version checks to support more GPUs Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-25 17:05:48 -04:00
Haowen Bai	b3ef3205bc	drm/amdgpu: Remove useless kfree After alloc fail, we do not need to kfree. Signed-off-by: Haowen Bai <baihaowen@meizu.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-25 17:05:38 -04:00
David Yu	a2443ef0a8	drm/amdgpu: Ta fw needs to be loaded for SRIOV aldebaran Load ta fw during psp_init_sriov_microcode to enable XGMI. It is required to be loaded by both guest and host starting from Arcturus. Cap fw needs to be loaded first. Signed-off-by: David Yu <David.Yu@amd.com> Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-22 14:51:51 -04:00
Evan Quan	114f088727	drm/amd/pm: fix the deadlock issue observed on SI The adev->pm.mutx is already held at the beginning of amdgpu_dpm_compute_clocks/amdgpu_dpm_enable_uvd/amdgpu_dpm_enable_vce. But on their calling path, amdgpu_display_bandwidth_update will be called and thus its sub functions amdgpu_dpm_get_sclk/mclk. They will then try to acquire the same adev->pm.mutex and deadlock will occur. By placing amdgpu_display_bandwidth_update outside of adev->pm.mutex protection(considering logically they do not need such protection) and restructuring the call flow accordingly, we can eliminate the deadlock issue. This comes with no real logics change. Fixes: `3712e7a494` ("drm/amd/pm: unified lock protections in amdgpu_dpm.c") Reported-by: Paul Menzel <pmenzel@molgen.mpg.de> Reported-by: Arthur Marsh <arthur.marsh@internode.on.net> Link: https://lore.kernel.org/all/9e689fea-6c69-f4b0-8dee-32c4cf7d8f9c@molgen.mpg.de/ BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1957 Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-22 14:50:23 -04:00
Tao Zhou	b3c76814ce	drm/amdgpu: add RAS fatal error interrupt handler The fatal error handler is independent from general ras interrupt handler since there is no related IH ring. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-22 14:50:18 -04:00
Tao Zhou	66f8794961	drm/amdgpu: add RAS poison consumption handler (v2) Add support for general RAS poison consumption handler. v2: remove callback function for poison consumption. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-22 14:50:13 -04:00
Tao Zhou	50a7d025ca	drm/amdgpu: add RAS poison creation handler (v2) Prepare for the implementation of poison consumption handler. v2: separate umc handler from poison creation. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-22 14:50:07 -04:00
Yang Wang	cc9d82fc96	drm/amdkfd: use kvcalloc() instead of kvmalloc() in kfd_migrate simplify programming with existing functions. Signed-off-by: Yang Wang <KevinYang.Wang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-22 14:49:54 -04:00
Bokun Zhang	e15c9d06e9	drm/amd/amdgpu: Update PF2VF header - In the latest version of the header, there is a variable name change. This should not cause any backward compatibility since the variable is at the same offset in the struct. Signed-off-by: Bokun Zhang <Bokun.Zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-21 16:00:14 -04:00
Bokun Zhang	451913e980	drm/amd/amdgpu: Properly indent PF2VF header - Clean up the identation in the header file Signed-off-by: Bokun Zhang <Bokun.Zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-21 16:00:09 -04:00
Bokun Zhang	c649287aba	drm/amd/amdgpu: Update MIT license in SRIOV msg header - Update MIT license header Signed-off-by: Bokun Zhang <Bokun.Zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-21 16:00:01 -04:00
Alex Deucher	72f05e3b96	drm/amdgpu/display: make hubp31_program_extended_blank static It's not used outside of dcn31_hubp.c. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-21 15:59:57 -04:00
Miaoqian Lin	e4f1e3a282	drm/amd/display: Fix memory leak in dcn21_clock_source_create When dcn20_clk_src_construct() fails, we need to release clk_src. Fixes: `6f4e6361c3` ("drm/amd/display: Add Renoir resource (v2)") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-21 15:59:52 -04:00
Haowen Bai	754fc1824b	drm/amd/display: Remove useless code aux_rep only memset but no use at all, so we drop it. Signed-off-by: Haowen Bai <baihaowen@meizu.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-21 15:59:46 -04:00
Alex Deucher	4020c22802	drm/amdgpu: don't runtime suspend if there are displays attached (v3) We normally runtime suspend when there are displays attached if they are in the DPMS off state, however, if something wakes the GPU we send a hotplug event on resume (in case any displays were connected while the GPU was in suspend) which can cause userspace to light up the displays again soon after they were turned off. Prior to commit `087451f372` ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's."), the driver took a runtime pm reference when the fbdev emulation was enabled because we didn't implement proper shadowing support for vram access when the device was off so the device never runtime suspended when there was a console bound. Once that commit landed, we now utilize the core fb helper implementation which properly handles the emulation, so runtime pm now suspends in cases where it did not before. Ultimately, we need to sort out why runtime suspend in not working in this case for some users, but this should restore similar behavior to before. v2: move check into runtime_suspend v3: wake ups -> wakeups in comment, retain pm_runtime behavior in runtime_idle callback Fixes: `087451f372` ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's.") Link: https://lore.kernel.org/r/20220403132322.51c90903@darkstar.example.org/ Tested-by: Michele Ballabio <ballabio.m@gmail.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-21 15:59:37 -04:00
Lang Yu	515d7cebc2	Revert "drm/amdkfd: only allow heavy-weight TLB flush on some ASICs for SVM too" This reverts commit `36bf93216e`. It causes SVM regressions on Vega10 with XNACK-ON. Just revert it at the moment. ./kfdtest --gtest_filter=KFDSVMRangeTest.MigratePolicyTest Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Philip Yang<Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-21 15:58:31 -04:00
Candice Li	e50d9ba0d2	drm/amdgpu: Add debugfs TA load/unload/invoke support v1: Add debugfs support to load/unload/invoke TA in runtime. v2: 1. Update some variables to static. 2. Use PAGE_ALIGN to calculate shared buf size directly. 3. Remove fp check. 4. Update debugfs from read to write. Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-21 15:58:22 -04:00
Candice Li	fe96e5636a	drm/amdgpu: Use indirect buffer and save response status for TA load/invoke The upcoming TA debugfs interface needs to use indirect buffer when performing TA invoke and check psp response status for TA load and invoke. Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-21 15:58:10 -04:00
David Yat Sin	747eea0732	drm/amdkfd: CRIU add support for GWS queues Add support to checkpoint/restore GWS (Global Wave Sync) queues. Signed-off-by: David Yat Sin <david.yatsin@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-21 11:32:38 -04:00
David Yat Sin	ab4d51d47f	drm/amdkfd: Fix GWS queue count dqm->gws_queue_count and pdd->qpd.mapped_gws_queue need to be updated each time the queue gets evicted. Fixes: `b8020b0304` ("drm/amdkfd: Enable over-subscription with >1 GWS queue") Signed-off-by: David Yat Sin <david.yatsin@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-21 11:30:42 -04:00
Tales Lelo da Aparecida	4ae6eeed93	MAINTAINERS: add docs entry to AMDGPU To make sure maintainers of amdgpu drivers are aware of any changes in their documentation, add its entry to MAINTAINERS. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Tales Lelo da Aparecida <tales.aparecida@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-19 13:58:32 -04:00
Tales Lelo da Aparecida	6954e5baa0	Documentation/gpu: Add entries to amdgpu glossary Add missing acronyms to the amdgppu glossary. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1939 Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Tales Lelo da Aparecida <tales.aparecida@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-19 13:58:32 -04:00
Tom Rix	79847f13a0	drm/radeon/kms: change evergreen_default_state table from global to static evergreen_default_state and evergreen_default_size are only used in evergreen.c. Single file symbols should be static. So move their definitions to evergreen_blit_shaders.h and change their storage-class-specifier to static. Remove unneeded evergreen_blit_shader.c evergreen_ps/vs definitions were removed with commit `4f86296758` ("drm/radeon/kms: remove r6xx+ blit copy routines") So their declarations in evergreen_blit_shader.h are not needed, so remove them. Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-19 13:58:32 -04:00
Tom Rix	3eccf76c2d	drm/amd/display: add virtual_setup_stream_attribute decl to header Smatch reports this issue virtual_link_hwss.c:32:6: warning: symbol 'virtual_setup_stream_attribute' was not declared. Should it be static? virtual_setup_stream_attribute is only used in virtual_link_hwss.c, but the other functions in the file are declared in the header file and used elsewhere. For consistency, add the virtual_setup_stream_attribute decl to virtual_link_hwss.h. Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-19 13:58:32 -04:00
Keita Suzuki	f3fa2becf2	drm/amd/pm: fix double free in si_parse_power_table() In function si_parse_power_table(), array adev->pm.dpm.ps and its member is allocated. If the allocation of each member fails, the array itself is freed and returned with an error code. However, the array is later freed again in si_dpm_fini() function which is called when the function returns an error. This leads to potential double free of the array adev->pm.dpm.ps, as well as leak of its array members, since the members are not freed in the allocation function and the array is not nulled when freed. In addition adev->pm.dpm.num_ps, which keeps track of the allocated array member, is not updated until the member allocation is successfully finished, this could also lead to either use after free, or uninitialized variable access in si_dpm_fini(). Fix this by postponing the free of the array until si_dpm_fini() and increment adev->pm.dpm.num_ps everytime the array member is allocated. Signed-off-by: Keita Suzuki <keitasuzuki.park@sslab.ics.keio.ac.jp> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-19 13:58:32 -04:00
Tales Lelo da Aparecida	a26b9e0b9b	drm/amd/display: make hubp1_wait_pipe_read_start() static It's a local function, let's make it static. AGD: remove prototype in dcn10_hubp.h Signed-off-by: Tales Lelo da Aparecida <tales.aparecida@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-19 13:58:32 -04:00
Darren Powell	f24044bd9b	amdgpu/pm: Clarify documentation of error handling in send_smc_mesg Clarify the smu_cmn_send_smc_msg_with_param documentation to mention two cases exist where messages are silently dropped with no error returned. These cases occur in unusual situations where either: 1. the message type is not allowed to a virtual GPU, or 2. a PCI recovery is underway and the HW is not yet in sync with the SW For more details see commit `4ea5081c82` ("drm/amd/powerplay: enable SMC message filter") commit `bf36b52e78` ("drm/amdgpu: Avoid accessing HW when suspending SW state") (v2) Reworked with suggestions from Luben & Paul (v3) Updated wording as per Luben's feedback Corrected error stating all messages denied on virtual GPU (each GPU has mask of which messages are allowed) Signed-off-by: Darren Powell <darren.powell@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-19 13:58:32 -04:00
Huang Rui	eea5c7b339	drm/amdgpu/pm: fix the null pointer while the smu is disabled It needs to check if the pp_funcs is initialized while release the context, otherwise it will trigger null pointer panic while the software smu is not enabled. [ 1109.404555] BUG: kernel NULL pointer dereference, address: 0000000000000078 [ 1109.404609] #PF: supervisor read access in kernel mode [ 1109.404638] #PF: error_code(0x0000) - not-present page [ 1109.404657] PGD 0 P4D 0 [ 1109.404672] Oops: 0000 [#1] PREEMPT SMP NOPTI [ 1109.404701] CPU: 7 PID: 9150 Comm: amdgpu_test Tainted: G OEL 5.16.0-custom #1 [ 1109.404732] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 1109.404765] RIP: 0010:amdgpu_dpm_force_performance_level+0x1d/0x170 [amdgpu] [ 1109.405109] Code: 5d c3 44 8b a3 f0 80 00 00 eb e5 66 90 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 08 4c 8b b7 f0 7d 00 00 <49> 83 7e 78 00 0f 84 f2 00 00 00 80 bf 87 80 00 00 00 48 89 fb 0f [ 1109.405176] RSP: 0018:ffffaf3083ad7c20 EFLAGS: 00010282 [ 1109.405203] RAX: 0000000000000000 RBX: ffff9796b1c14600 RCX: 0000000002862007 [ 1109.405229] RDX: ffff97968591c8c0 RSI: 0000000000000001 RDI: ffff9796a3700000 [ 1109.405260] RBP: ffffaf3083ad7c50 R08: ffffffff9897de00 R09: ffff979688d9db60 [ 1109.405286] R10: 0000000000000000 R11: ffff979688d9db90 R12: 0000000000000001 [ 1109.405316] R13: ffff9796a3700000 R14: 0000000000000000 R15: ffff9796a3708fc0 [ 1109.405345] FS: 00007ff055cff180(0000) GS:ffff9796bfdc0000(0000) knlGS:0000000000000000 [ 1109.405378] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1109.405400] CR2: 0000000000000078 CR3: 000000000a394000 CR4: 00000000000506e0 [ 1109.405434] Call Trace: [ 1109.405445] <TASK> [ 1109.405456] ? delete_object_full+0x1d/0x20 [ 1109.405480] amdgpu_ctx_set_stable_pstate+0x7c/0xa0 [amdgpu] [ 1109.405698] amdgpu_ctx_fini.part.0+0xcb/0x100 [amdgpu] [ 1109.405911] amdgpu_ctx_do_release+0x71/0x80 [amdgpu] [ 1109.406121] amdgpu_ctx_ioctl+0x52d/0x550 [amdgpu] [ 1109.406327] ? _raw_spin_unlock+0x1a/0x30 [ 1109.406354] ? drm_gem_handle_delete+0x81/0xb0 [drm] [ 1109.406400] ? amdgpu_ctx_get_entity+0x2c0/0x2c0 [amdgpu] [ 1109.406609] drm_ioctl_kernel+0xb6/0x140 [drm] Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Aaron Liu <aaron.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-19 13:58:32 -04:00
Lang Yu	36bf93216e	drm/amdkfd: only allow heavy-weight TLB flush on some ASICs for SVM too The idea is from commit `a50fe70780` ("drm/amdkfd: Only apply heavy-weight TLB flush on Aldebaran") and commit `f61c40c075` ("drm/amdkfd: enable heavy-weight TLB flush on Arcturus"). At the moment, heavy-weight TLB could cause problems on ASICs except Aldebaran and Arcturus. A simple hipMallocManaged/hipFree program could trigger this issue. [ 97.787657] amdgpu 0000:01:00.0: amdgpu: wait for kiq fence error: 0. [ 106.868758] amdgpu: qcm fence wait loop timeout expired [ 106.868966] amdgpu: The cp might be in an unrecoverable state due to an unsuccessful queues preemption [ 106.869203] amdgpu: Failed to evict process queues [ 106.869261] amdgpu: Failed to quiesce KFD Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-04-19 13:58:07 -04:00

1075681 Commits