linux

Author	SHA1	Message	Date
Guchun Chen	a219ecbb83	drm/amdgpu: disable page reservation when amdgpu_bad_page_threshold = 0 When amdgpu_bad_page_threshold = 0, bad page reservation stuffs are skipped in either UMC ECC irq or page retirement calling of sync flood isr. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:27:20 -04:00
Guchun Chen	f848159b57	drm/amdgpu: decouple sysfs creating of bad page node Bad page information should not be exposed by sysfs when bad page retirement is disabled, so decouple it from ras sysfs group creating, and add one guard before creating. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:27:16 -04:00
Guchun Chen	eb0c3cd48f	drm/amdgpu: add one definition for RAS's sysfs/debugfs name(v2) Add one definition for the RAS module's FS name. It's used in both debugfs and sysfs cases. v2: Use static variable instead of macro definition. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:27:08 -04:00
Guchun Chen	bf0b91b78f	drm/amdgpu: restore ras flags when user resets eeprom(v2) RAS flags needs to be cleaned as well when user requires one clean eeprom. v2: RAS flags shall be restored after eeprom reset succeeds. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:27:04 -04:00
Guchun Chen	e8fbaf0342	drm/amdgpu: break GPU recovery once it's in bad state(v4) When GPU executes recovery and retrieving bad GPU tag from external eeprom device, the recovery will be broken and error message is printed as well for user's awareness. v2: Refine warning message in threshold reaching case, and fix spelling typo. v3: Fix explicit calling of bad gpu. v4: Rename function names. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:26:54 -04:00
Guchun Chen	9c06f91ff2	drm/amdgpu: schedule ras recovery when reaching bad page threshold(v2) Once the bad page saved to eeprom reaches the configured threshold, ras recovery will be issued to notify user. v2: Fix spelling typo. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:26:46 -04:00
Guchun Chen	35cd2cdadb	drm/amdgpu: skip bad page reservation once issuing from eeprom write Once the ras recovery is issued from eeprom write itself, bad page reservation should be ignored, otherwise, recursive calling of writing to eeprom would happen. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:26:38 -04:00
Guchun Chen	b82e65a935	drm/amdgpu: break driver init process when it's bad GPU(v5) When retrieving bad gpu tag from eeprom, GPU init should fail as the GPU needs to be retired for further check. v2: Fix spelling typo, correct the condition to detect bad gpu tag and refine error message. v3: Refine function argument name. v4: Fix missing check of returning value of i2c initialization error case. v5: Use dev_err to print PCI information in dmesg instead of DRM_ERROR. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:26:31 -04:00
Guchun Chen	1d6a9d122d	drm/amdgpu: add bad gpu tag definition This tag will be hired for bad gpu detection in eeprom's access. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:26:16 -04:00
Guchun Chen	c84d46707e	drm/amdgpu: validate bad page threshold in ras(v3) Bad page threshold value should be valid in the range between -1 and max records length of eeprom. It could determine when saved bad pages exceed threshold value, and proceed corresponding actions. v2: When using the default typical value, it should be min value between typical value and eeprom max records length. v3: drop the case of setting bad_page_cnt_threshold to be 0xFFFFFFFF, as it confuses user. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:25:58 -04:00
Guchun Chen	acc0204cdb	drm/amdgpu: add bad page count threshold in module parameter(v3) bad_page_threshold could be configured to enable/disable the associated bad page retirement feature in RAS. When it's -1, ras will use typical bad page failure value to handle bad page retirement. When it's 0, disable bad page retirement, and no bad page will be recorded and saved. For other valid value, driver will use this manual value as the threshold value of total bad pages. v2: correct documentation of this parameter. v3: remove confused statement in documentation. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:25:41 -04:00
Alex Deucher	2456c290a7	Revert "drm/amdgpu: Fix NULL dereference in dpm sysfs handlers" This regressed some working configurations so revert it. Will fix this properly for 5.9 and backport then. This reverts commit `38e0c89a19`. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2020-07-30 15:36:44 -04:00
Jiansong Chen	74b3595913	drm/amdgpu: enable GFXOFF for navy_flounder Enable GFXOFF for navy_flounder. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-30 15:36:44 -04:00
Liu ChengZhe	f61772cd13	drm amdgpu: Skip tmr load for SRIOV 1. For Navi12, CHIP_SIENNA_CICHLID, skip tmr load operation; 2. Check pointer before release firmware. v2: use CHIP_SIENNA_CICHLID instead v3: remove local "bool ret"; fix grammar issue v4: use my name instead of "root" v5: fix grammar issue and indent issue Signed-off-by: Liu ChengZhe <ChengZhe.Liu@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-30 15:36:44 -04:00
Liu ChengZhe	392cf6a739	drm/amdgpu: fix PSP autoload twice in FLR Assigning false to block->status.hw overwrites PSP's previous hardware status, which causes the PSP to Resume operation after hardware init. Remove this assignment and let the PSP execute Resume operation when it is told to. v2: Remove the braces. v3: Modify the description. Signed-off-by: Liu ChengZhe <ChengZhe.Liu@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-30 15:36:44 -04:00
Peilin Ye	8e32628592	drm/amdgpu: Prevent kernel-infoleak in amdgpu_info_ioctl() Compiler leaves a 4-byte hole near the end of `dev_info`, causing amdgpu_info_ioctl() to copy uninitialized kernel stack memory to userspace when `size` is greater than 356. In 2015 we tried to fix this issue by doing `= {};` on `dev_info`, which unfortunately does not initialize that 4-byte hole. Fix it by using memset() instead. Cc: stable@vger.kernel.org Fixes: `c193fa91b9` ("drm/amdgpu: information leak in amdgpu_info_ioctl()") Fixes: `d38ceaf99e` ("drm/amdgpu: add core driver (v4)") Suggested-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-30 15:36:44 -04:00
Jiansong Chen	defa489636	drm/amdgpu: update GC golden setting for navy_flounder Update GC golden setting for navy_flounder. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-30 15:36:44 -04:00
John Clements	01eee24fce	drm/amdgpu: enable umc 8.7 functions in gmc v10 add support for umc 8.7 initialization add umc 8.7 source to makefile Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-30 15:36:33 -04:00
Huang Rui	35dab589de	drm/amdgpu: skip crit temperature values on APU (v2) It doesn't expose PPTable descriptor on APU platform. So max/min temperature values cannot be got from APU platform. v2: Stoney needs to skip crit temperature as well. Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-30 14:14:07 -04:00
Alex Deucher	6cd3c6798a	drm/amdgpu/si: initial support for GPU reset Ported from radeon. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-28 09:22:57 -04:00
Mauro Rossi	64200c468f	drm/amdgpu: enable DC support for SI parts (v2) [Why] amdgpu_device.c requires changes for SI chipsets support si.c require changes for Display Manager IP block enabling [How] amdgpu_device.c: add SI families in amdgpu_device_asic_has_dc_support() si.c: changes in si_set_ip_blocks() for Display Manager IP blocks enablement (v1) NOTE: As per Kaveri and older amdgpu.dc=1 kernel cmdline is required (v2) fix for bc011f9350 ("drm/amdgpu: Change SI/CI gfx/sdma/smu init sequence") remove CHIP_HAINAN support since it does not have physical DCE6 module Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mauro Rossi <issor.oruam@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-28 09:22:48 -04:00
John Clements	48ef409c25	drm/amdgpu: add support for umc 8.7 ras functions added support for umc 8.7 error reporting and query Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-27 16:23:00 -04:00
Evan Quan	81b41ff5d2	drm/amd/powerplay: revise the outputs layout of amdgpu_pm_info debugfs The current outputs of amdgpu_pm_info debugfs come with clock gating status and followed by current clock/power information. However the clock gating status retrieving may pull GFX out of CG status. That will make the succeeding clock/power information retrieving inaccurate. To overcome this and be with minimum impact, the outputs are updated to show current clock/power information first. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-27 16:22:37 -04:00
James Zhu	1df67a4ece	Revert "drm/amdgpu/vcn3.0: remove extra asic type check" This reverts commit 058c07201ec7d373fc6a0a570b38a8a9d62c29fb. Chip NAVY_FLOUNDER uses vcn3.0, but it has only one VCN instance. Signed-off-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-27 16:22:23 -04:00
Mukul Joshi	2c2b0d880f	drm/amdkfd: Add thermal throttling SMI event Add support for reporting thermal throttling events through SMI. Also, add a counter to count the number of throttling interrupts observed and report the count in the SMI event message. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-27 16:21:50 -04:00
Dennis Li	df9c8d1aa2	drm/amdgpu: fix system hang issue during GPU reset when GPU hang, driver has multi-paths to enter amdgpu_device_gpu_recover, the atomic adev->in_gpu_reset and hive->in_reset are used to avoid re-entering GPU recovery. During GPU reset and resume, it is unsafe that other threads access GPU, which maybe cause GPU reset failed. Therefore the new rw_semaphore adev->reset_sem is introduced, which protect GPU from being accessed by external threads during recovery. v2: 1. add rwlock for some ioctls, debugfs and file-close function. 2. change to use dqm->is_resetting and dqm_lock for protection in kfd driver. 3. remove try_lock and change adev->in_gpu_reset as atomic, to avoid re-enter GPU recovery for the same GPU hang. v3: 1. change back to use adev->reset_sem to protect kfd callback functions, because dqm_lock couldn't protect all codes, for example: free_mqd must be called outside of dqm_lock; [ 1230.176199] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 05/23/2019 [ 1230.177221] Call Trace: [ 1230.178249] dump_stack+0x98/0xd5 [ 1230.179443] amdgpu_virt_kiq_reg_write_reg_wait+0x181/0x190 [amdgpu] [ 1230.180673] gmc_v9_0_flush_gpu_tlb+0xcc/0x310 [amdgpu] [ 1230.181882] amdgpu_gart_unbind+0xa9/0xe0 [amdgpu] [ 1230.183098] amdgpu_ttm_backend_unbind+0x46/0x180 [amdgpu] [ 1230.184239] ? ttm_bo_put+0x171/0x5f0 [ttm] [ 1230.185394] ttm_tt_unbind+0x21/0x40 [ttm] [ 1230.186558] ttm_tt_destroy.part.12+0x12/0x60 [ttm] [ 1230.187707] ttm_tt_destroy+0x13/0x20 [ttm] [ 1230.188832] ttm_bo_cleanup_memtype_use+0x36/0x80 [ttm] [ 1230.189979] ttm_bo_put+0x1be/0x5f0 [ttm] [ 1230.191230] amdgpu_bo_unref+0x1e/0x30 [amdgpu] [ 1230.192522] amdgpu_amdkfd_free_gtt_mem+0xaf/0x140 [amdgpu] [ 1230.193833] free_mqd+0x25/0x40 [amdgpu] [ 1230.195143] destroy_queue_cpsch+0x1a7/0x270 [amdgpu] [ 1230.196475] pqm_destroy_queue+0x105/0x260 [amdgpu] [ 1230.197819] kfd_ioctl_destroy_queue+0x37/0x70 [amdgpu] [ 1230.199154] kfd_ioctl+0x277/0x500 [amdgpu] [ 1230.200458] ? kfd_ioctl_get_clock_counters+0x60/0x60 [amdgpu] [ 1230.201656] ? tomoyo_file_ioctl+0x19/0x20 [ 1230.202831] ksys_ioctl+0x98/0xb0 [ 1230.204004] __x64_sys_ioctl+0x1a/0x20 [ 1230.205174] do_syscall_64+0x5f/0x250 [ 1230.206339] entry_SYSCALL_64_after_hwframe+0x49/0xbe 2. remove try_lock and introduce atomic hive->in_reset, to avoid re-enter GPU recovery. v4: 1. remove an unnecessary whitespace change in kfd_chardev.c 2. remove comment codes in amdgpu_device.c 3. add more detailed comment in commit message 4. define a wrap function amdgpu_in_reset v5: 1. Fix some style issues. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com> Suggested-by: Lijo Lazar <Lijo.Lazar@amd.com> Suggested-by: Luben Tukov <luben.tuikov@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-27 16:21:37 -04:00
Boyuan Zhang	c5079f35c0	drm/amdgpu: update dec ring test for VCN 3.0 Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-27 16:21:16 -04:00
James Zhu	309182389e	drm/amdgpu/vcn3.0: remove extra asic type check vcn ip block is already selected based on ASIC type during set_ip_blocks. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-27 16:21:05 -04:00
James Zhu	156589f74d	drm/amdgpu/jpeg3.0: remove extra asic type check jpeg ip block is already selected based on ASIC type during set_ip_blocks. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-27 16:21:00 -04:00
James Zhu	8214617aaf	drm/amdgpu: Remove extra asic type check vcn ip block is already selected based on ASIC type during set_ip_blocks Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-27 16:20:35 -04:00
James Zhu	de7fe7e87a	drm/amdgpu/jpeg: Remove extra asic type check jpeg ip block is already selected based on ASIC type during set_ip_blocks. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-27 16:20:22 -04:00
Alex Deucher	ccda42a462	drm/amdgpu/powerplay: add some documentation about memory clock We expose the actual memory controller clock rate in Linux, not the effective memory clock of the DRAMs. To translate it, it follows the following formula: Clock conversion (Mhz): HBM: effective_memory_clock = memory_controller_clock * 1 G5: effective_memory_clock = memory_controller_clock * 1 G6: effective_memory_clock = memory_controller_clock * 2 DRAM data rate (MT/s): HBM: effective_memory_clock * 2 = data_rate G5: effective_memory_clock * 4 = data_rate G6: effective_memory_clock * 8 = data_rate Bandwidth (MB/s): data_rate * vram_bit_width / 8 = memory_bandwidth Some examples: G5 on RX460: memory_controller_clock = 1750 Mhz effective_memory_clock = 1750 Mhz * 1 = 1750 Mhz data rate = 1750 * 4 = 7000 MT/s memory_bandwidth = 7000 * 128 bits / 8 = 112000 MB/s G6 on RX5600: memory_controller_clock = 900 Mhz effective_memory_clock = 900 Mhz * 2 = 1800 Mhz data rate = 1800 * 8 = 14400 MT/s memory_bandwidth = 14400 * 192 bits / 8 = 345600 MB/s Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-23 10:45:16 -04:00
John Clements	c5a4ef3e20	drm/amdgpu: move umc specific macros to header certain umc macros are common across umc versions Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-23 10:45:00 -04:00
Likun Gao	8f3b800a31	drm/amdgpu: update golden setting for sienna_cichlid Update golden setting for sienna_cichlid. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-23 10:44:45 -04:00
Tom St Denis	06b668c1dc	drm/amd/amdgpu: Fix compiler warning in df driver Fix this warning: CC [M] drivers/gpu/drm/amd/amdgpu/gfxhub_v1_1.o In file included from drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h:29, from drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h:26, from drivers/gpu/drm/amd/amdgpu/amdgpu.h:43, from drivers/gpu/drm/amd/amdgpu/df_v3_6.c:23: drivers/gpu/drm/amd/amdgpu/df_v3_6.c: In function ‘df_v3_6_pmc_get_count’: ./include/drm/drm_print.h:487:2: warning: ‘hi_base_addr’ may be used uninitialized in this function [-Wmaybe-uninitialized] 487 | __drm_dbg(DRM_UT_DRIVER, fmt, ##__VA_ARGS__) | ^~~~~~~~~ drivers/gpu/drm/amd/amdgpu/df_v3_6.c:649:25: note: ‘hi_base_addr’ was declared here 649 | uint32_t lo_base_addr, hi_base_addr, lo_val = 0, hi_val = 0; | ^~~~~~~~~~~~ In file included from drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h:29, from drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h:26, from drivers/gpu/drm/amd/amdgpu/amdgpu.h:43, from drivers/gpu/drm/amd/amdgpu/df_v3_6.c:23: ./include/drm/drm_print.h:487:2: warning: ‘lo_base_addr’ may be used uninitialized in this function [-Wmaybe-uninitialized] 487 | __drm_dbg(DRM_UT_DRIVER, fmt, ##__VA_ARGS__) | ^~~~~~~~~ drivers/gpu/drm/amd/amdgpu/df_v3_6.c:649:11: note: ‘lo_base_addr’ was declared here 649 | uint32_t lo_base_addr, hi_base_addr, lo_val = 0, hi_val = 0; Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-22 18:43:16 -04:00
Huang Rui	db92fbc3d7	drm/amdgpu: won't include gc and mmhub register headers in GMC block All gc/mmhub register access and operation should be in gfxhub/mmhub level. Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-22 18:43:10 -04:00
Huang Rui	caa9f483ca	drm/amdgpu: move get_invalidate_req function into gfxhub/mmhub level This patch is to move get_invalidate_req into gfxhub/mmhub level. It will avoid mismatch of the different gfxhub/mmhub register offsets and fields in the same gmc block. Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-22 18:43:04 -04:00
Huang Rui	2577db91e8	drm/amdgpu: add vmhub funcs helper (v2) This patch is to introduce vmhub funcs helper to add following callback (print_l2_protection_fault_status). Each GC/MMHUB register specific programming should be in gfxhub/mmhub level. v2: remove the condition of funcs assignment. Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-22 18:42:56 -04:00
Huang Rui	f2c1b5c145	drm/amdgpu: abstract set_vm_fault_masks function to refine the programming This patch is to add set_vm_fault_masks helper to amdgpu_gmc to refine the original programming. Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-22 18:42:49 -04:00
Huang Rui	5befb6fc3b	drm/amdgpu: add member to store vm fault interrupt masks This patch adds a member in vmhub structure to store the vm fault interrupt masks for different version gfxhubs/mmhubs. Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-22 18:42:42 -04:00
Guchun Chen	b16284259f	drm/amdgpu: add printing after executing page reservation to eeprom This will tell users if the faulty page has been written to external eeprom device in dmesg log. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-22 18:42:23 -04:00
John Clements	4922f1bcad	drm/amdgpu: expand sienna cichlid reg access support Added dedicated 64bit reg read/write support Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-22 18:42:09 -04:00
Alex Deucher	a519fd83cf	drm/amdgpu: remove eeprom from the smu i2c handlers The driver uses it for EEPROM access, but it's just an i2c bus. Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-21 15:37:49 -04:00
Alex Deucher	84dd1f698e	drm/amdgpu: move i2c bus lock out of ras structure It's not really ras related. It's just a lock for the bus in general. This removes the ras dependency from the smu i2c bus. Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-21 15:37:40 -04:00
Paweł Gronowski	9cb268215d	drm/amdgpu: Fix NULL dereference in dpm sysfs handlers NULL dereference occurs when string that is not ended with space or newline is written to some dpm sysfs interface (for example pp_dpm_sclk). This happens because strsep replaces the tmp with NULL if the delimiter is not present in string, which is then dereferenced by tmp[0]. Reproduction example: sudo sh -c 'echo -n 1 > /sys/class/drm/card0/device/pp_dpm_sclk' Signed-off-by: Paweł Gronowski <me@woland.xyz> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-21 15:37:40 -04:00
James Zhu	4908d02637	drm/amdgpu/vcn: merge shared memory into vcpu Merge vcn firmware shared memory bo into vcn vcpu bo. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-21 15:37:40 -04:00
James Zhu	d10985f46e	Revert "drm/amdgpu/vcn: add shared memory restore after wake up from sleep." This reverts commit `21b704d783`. To merge vcn firmware shared memory bo into vcn vcpu bo. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-21 15:37:39 -04:00
Nirmoy Das	05cac1ae8f	drm/amdgpu: do not disable SMU on vm reboot For passthrough device, we do baco reset after 1st vm boot so if we disable SMU on 1st VM shutdown baco reset will fail for 2nd vm boot. Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-21 15:37:39 -04:00
Chengming Gui	5ea6f9c22c	drm/amdgpu: add timeout flush mechanism to update wptr for self interrupt (v2) outstanding log reaches threshold will trigger IH ring1/2's wptr reported, that will avoid generating interrupts to ring0 too frequent. But if ring1/2's wptr hasn't been increased for a long time, the outstanding log can't reach threshold so that driver can't get latest wptr info and miss some interrupts. v2: squash in warning fix Signed-off-by: Chengming Gui <Jack.Gui@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-21 15:37:39 -04:00
John Clements	c652923afa	drm/amdgpu: enable xgmi support for sienna cichlid set xgmi support flag during nv ip init sequence Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-21 15:37:39 -04:00
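
The memory clock formulas documented in commit ccda42a462 can be checked with a few lines of C. This is only a sketch of the arithmetic in that commit message: the enum and function names are hypothetical and do not come from the driver, and reading G5/G6 as GDDR5/GDDR6 is an assumption.

```c
#include <stdint.h>

/* Memory types as abbreviated in the commit message.
 * Assumption: G5 and G6 stand for GDDR5 and GDDR6. */
enum mem_type { MEM_HBM, MEM_G5, MEM_G6 };

/* Effective memory clock (MHz) from the memory controller clock (MHz).
 * Per the commit message: x1 for HBM and G5, x2 for G6. */
static uint64_t effective_clock_mhz(enum mem_type type, uint64_t mc_clock_mhz)
{
    return mc_clock_mhz * (type == MEM_G6 ? 2 : 1);
}

/* DRAM data rate (MT/s): effective clock x2 (HBM), x4 (G5), x8 (G6). */
static uint64_t data_rate_mts(enum mem_type type, uint64_t mc_clock_mhz)
{
    uint64_t eff = effective_clock_mhz(type, mc_clock_mhz);

    switch (type) {
    case MEM_HBM:
        return eff * 2;
    case MEM_G5:
        return eff * 4;
    case MEM_G6:
        return eff * 8;
    }
    return 0; /* unreachable for valid enum values */
}

/* Bandwidth (MB/s) = data_rate * vram_bit_width / 8. */
static uint64_t bandwidth_mbs(enum mem_type type, uint64_t mc_clock_mhz,
                              unsigned int vram_bit_width)
{
    return data_rate_mts(type, mc_clock_mhz) * vram_bit_width / 8;
}
```

With the two examples from the commit message, bandwidth_mbs(MEM_G5, 1750, 128) gives 112000 MB/s (RX460) and bandwidth_mbs(MEM_G6, 900, 192) gives 345600 MB/s (RX5600).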
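
Commit 9cb268215d (later reverted by 2456c290a7) describes strsep setting its cursor to NULL when no delimiter is found. The userspace sketch below reproduces only that library behaviour; the helper name and buffer size are hypothetical, and this is not the driver's sysfs handler.

```c
#define _DEFAULT_SOURCE /* expose strsep() on glibc under strict -std modes */
#include <stddef.h>
#include <string.h>

/* Returns 1 if strsep() leaves the cursor NULL after the first token,
 * which happens when the input contains no delimiter at all.
 * Any later tmp[0] access would then dereference NULL. */
static int cursor_is_null_after_first_token(const char *input)
{
    char buf[32];
    char *tmp;
    char *token;

    /* strsep() modifies its input, so work on a bounded local copy. */
    strncpy(buf, input, sizeof(buf) - 1);
    buf[sizeof(buf) - 1] = '\0';

    tmp = buf;
    token = strsep(&tmp, " \n");
    (void)token; /* only the cursor state matters for this demonstration */

    return tmp == NULL;
}
```

For the reproduction string written by `echo -n 1` (no trailing newline) the helper returns 1; for "1\n" it returns 0, because the newline is found and the cursor points just past it. A caller that loops on tmp[0] needs a NULL check on tmp first.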
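
Commit 8e32628592 replaces a `= {}` initializer with memset() because the initializer did not clear a padding hole in `dev_info`. A minimal sketch of the safe pattern follows, assuming a made-up struct (demo_info is hypothetical, not the driver's structure) whose layout has padding on common 64-bit ABIs.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical struct: on common 64-bit ABIs the compiler inserts
 * 4 padding bytes after 'a' so that 'b' is 8-byte aligned. */
struct demo_info {
    uint32_t a;
    uint64_t b;
};

/* Number of bytes in the struct that belong to no member. */
static size_t padding_bytes(void)
{
    return sizeof(struct demo_info) - sizeof(uint32_t) - sizeof(uint64_t);
}

/* memset() writes every byte of the object, padding included, so a
 * byte-wise copy (what a copy to userspace effectively is) sees zeros. */
static int all_bytes_zero_after_memset(void)
{
    struct demo_info info;
    unsigned char raw[sizeof(info)];
    size_t i;

    memset(&info, 0, sizeof(info));
    memcpy(raw, &info, sizeof(info));

    for (i = 0; i < sizeof(raw); i++) {
        if (raw[i] != 0)
            return 0;
    }
    return 1;
}
```

The sketch does not try to show stale bytes surviving a `= {}` initializer, since that depends on the compiler and on what was previously on the stack.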
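
The bad_page_threshold semantics described in commits acc0204cdb and c84d46707e can be summarised as a small decision function. This is a sketch under stated assumptions: the struct, the function name and the parameter names are hypothetical, and rejecting out-of-range values is an assumption, since the commit messages only say that the valid range runs from -1 to the EEPROM's maximum record count.

```c
#include <stdbool.h>

/* Hypothetical result type for this sketch. */
struct threshold_result {
    bool valid;              /* parameter was inside the documented range */
    bool retirement_enabled; /* false when the parameter is 0 */
    unsigned int threshold;  /* meaningful only when retirement is enabled */
};

/* param:       value of the bad_page_threshold module parameter
 * typical:     the "typical bad page failure value" (driver-defined)
 * max_records: maximum number of records the EEPROM can hold */
static struct threshold_result resolve_threshold(int param,
                                                 unsigned int typical,
                                                 unsigned int max_records)
{
    struct threshold_result r = { false, false, 0 };

    /* Assumption: values outside [-1, max_records] are rejected here. */
    if (param < -1 || (param > 0 && (unsigned int)param > max_records))
        return r;

    r.valid = true;

    /* 0 disables bad page retirement; nothing is recorded or saved. */
    if (param == 0)
        return r;

    r.retirement_enabled = true;
    if (param == -1) {
        /* Default: the smaller of the typical value and the EEPROM limit. */
        r.threshold = typical < max_records ? typical : max_records;
    } else {
        /* Any other valid value is used directly as the threshold. */
        r.threshold = (unsigned int)param;
    }
    return r;
}
```

For example, with typical = 100 and max_records = 64, a parameter of -1 resolves to a threshold of 64, and a parameter of 10 resolves to 10.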


7762 Commits