linux

Author	SHA1	Message	Date
Yong Zhao	8bdab6bb1c	drm/amdgpu: Increase timout on emulator to tenfold instead of twice Since emulators are slower, sometime some operations like flushing tlb through FM need more than twice the regular timout of 100ms, so increase the timeout to 1s on emulators. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-26 14:21:03 -05:00
Dave Airlie	1b245ec5b6	drm-misc-next for 5.7: UAPI Changes: - lima: Add support for heap buffers Cross-subsystem Changes: Core Changes: - Implement mode_config mode_valid for memory constrained drivers - Bus format negociation between bridges - Consolidate fake vblank events for drivers without vblank interrupts - drm/bufs: dma_alloc related cleanups - drm/dp_mst: Various fixes - drm/print: New drm_device based print helpers - Thomas is a drm-misc maintainer now! Driver Changes: - DPMS cleanups for atomic drivers - Removal of owner field in SPI tinydrm drivers - Removal of explicit dependency on DT for tinydrm drivers - Conversion to YAML schemas for DT bindings - tidss: New driver - virtio: various reworks and fixes - Our usual dozen or so new panels or bridges -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCXkEjOgAKCRDj7w1vZxhR xeaDAQD+1MludG4RmfQhATe4jTsPC1r2x63OF2CA0ChMGHXJyQEA8qqQ+8y1Cd/u PZ3PpcTl4qYYHgzJ6FwW7kDPTvlaZQE= =IJAt -----END PGP SIGNATURE----- Merge tag 'drm-misc-next-2020-02-10' of git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for 5.7: UAPI Changes: - lima: Add support for heap buffers Cross-subsystem Changes: Core Changes: - Implement mode_config mode_valid for memory constrained drivers - Bus format negociation between bridges - Consolidate fake vblank events for drivers without vblank interrupts - drm/bufs: dma_alloc related cleanups - drm/dp_mst: Various fixes - drm/print: New drm_device based print helpers - Thomas is a drm-misc maintainer now! Driver Changes: - DPMS cleanups for atomic drivers - Removal of owner field in SPI tinydrm drivers - Removal of explicit dependency on DT for tinydrm drivers - Conversion to YAML schemas for DT bindings - tidss: New driver - virtio: various reworks and fixes - Our usual dozen or so new panels or bridges Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20200210093421.xu4sofldm6wm6xq6@gilmour.lan	2020-02-21 05:44:40 +10:00
Rajneesh Bhardwaj	9593f4d6a6	drm/amdkfd: refactor runtime pm for baco So far the kfd driver implemented same routines for runtime and system wide suspend and resume (s2idle or mem). During system wide suspend the kfd aquires an atomic lock that prevents any more user processes to create queues and interact with kfd driver and amd gpu. This mechanism created problem when amdgpu device is runtime suspended with BACO enabled. Any application that relies on kfd driver fails to load because the driver reports a locked kfd device since gpu is runtime suspended. However, in an ideal case, when gpu is runtime suspended the kfd driver should be able to: - auto resume amdgpu driver whenever a client requests compute service - prevent runtime suspend for amdgpu while kfd is in use This change refactors the amdgpu and amdkfd drivers to support BACO and runtime power management. Reviewed-by: Oak Zeng <oak.zeng@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-12 16:00:54 -05:00
Christian König	c12b84d6e0	drm/amdgpu: use the BAR if possible in amdgpu_device_vram_access v2 This should speed up debugging VRAM access a lot. v2: add HDP flush/invalidate Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Jonathan Kim <Jonathan.Kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-07 11:45:25 -05:00
Christian König	ce05ac56e6	drm/amdgpu: optimize amdgpu_device_vram_access a bit. Only write the _HI register when necessary. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Jonathan Kim <Jonathan.Kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-07 11:45:17 -05:00
Jack Zhang	86b93fd62d	drm/amdgpu/sriov Don't send msg when smu suspend For sriov and pp_onevf_mode, do not send message to set smu status, because smu doesn't support these messages under VF. Besides, it should skip smu_suspend when pp_onevf_mode is disabled. Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-07 11:44:24 -05:00
Alex Deucher	2cb44fb093	drm/amdgpu: enable GPU reset by default on renoir Everything is in place. Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-27 16:46:45 -05:00
Alex Deucher	658c663947	drm/amdgpu: enable GPU reset by default on Navi Has been working fine for a while. Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-27 16:46:45 -05:00
Chris Wilson	7e13ad8964	drm: Avoid drm_global_mutex for simple inc/dec of dev->open_count Since drm_global_mutex is a true global mutex across devices, we don't want to acquire it unless absolutely necessary. For maintaining the device local open_count, we can use atomic operations on the counter itself, except when making the transition to/from 0. Here, we tackle the easy portion of delaying acquiring the drm_global_mutex for the final release by using atomic_dec_and_mutex_lock(), leaving the global serialisation across the device opens. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Thomas Hellström (VMware) <thomas_os@shipmail.org> Reviewed-by: Thomas Hellström <thellstrom@vmware.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200124130107.125404-1-chris@chris-wilson.co.uk	2020-01-24 17:41:34 +00:00
Nirmoy Das	a9d4fe2fd6	drm/amdgpu: remove unnecessary conversion to bool Better clean that up before some automation starts to complain about it Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-22 16:55:27 -05:00
chen gong	c68dbcd8f9	drm/amdgpu: add kiq version interface for RREG32/WREG32 Reading some registers by mmio will result in hang when GPU is in "gfxoff" state.This problem can be solved by GPU in "ring command packages" way. Signed-off-by: chen gong <curry.gong@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-22 16:34:22 -05:00
chen gong	d33a99c4b6	drm/amdgpu: provide a generic function interface for reading/writing register by KIQ Move amdgpu_virt_kiq_rreg/amdgpu_virt_kiq_wreg function to amdgpu_gfx.c, and rename them to amdgpu_kiq_rreg/amdgpu_kiq_wreg.Make it generic and flexible. Signed-off-by: chen gong <curry.gong@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-22 16:34:14 -05:00
Pan, Xinhui	bd05221123	drm/amdgpu: add the lost mutex_init back Initialize notifier_lock. Bug: https://gitlab.freedesktop.org/drm/amd/issues/1016 Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: xinhui pan <xinhui.pan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-17 16:03:09 -05:00
Hawking Zhang	e9d4cf918f	drm/amdgpu: add arcturus to gpu recovery check code path support check if dirver should try gpu recovery for arcturus Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-16 13:40:54 -05:00
Evan Quan	9530273ec9	drm/amd/powerplay: cover the powerplay implementation details V3 This can save users much troubles. As they do not actually need to care whether swSMU or traditional powerplay routine should be used. V2: apply the fixes to vi.c and cik.c also V3: squash in oops fix Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-14 10:18:08 -05:00
Jack Zhang	895bd048fb	amd/amdgpu/sriov tdr enablement with pp_onevf_mode Under sriov and pp_onevf mode, 1.take resume instead of hw_init for smc recover to avoid potential memory leak. 2.add return condition inside smc resume function for sriov_pp_onevf_mode and pm_enabled param. Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-07 11:58:19 -05:00
zhengbin	2a9b90ae47	drm/amdgpu: use true, false for bool variable in amdgpu_device.c Fixes coccicheck warning: drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:3961:1-19: WARNING: Assignment of 0/1 to bool variable drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:3981:1-19: WARNING: Assignment of 0/1 to bool variable Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: zhengbin <zhengbin13@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-23 15:00:01 -05:00
Ma Feng	e3c00faa7a	drm/amdgpu: Remove unneeded variable 'ret' in amdgpu_device.c Fixes coccicheck warning: drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1036:5-8: Unneeded variable: "ret". Return "0" on line 1079 Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Ma Feng <mafeng.ma@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-23 14:58:56 -05:00
Jane Jian	d83c7a07a7	drm/amdgpu: update VCN1(dual instances) fw types ID and VCN ip block type Previously there is no VCN1 type ID in psp gfx interface. Also add VCN ip block type unless the reinit after FLR for sriov would fail. Signed-off-by: Jane Jian <Jane.Jian@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:33:26 -05:00
Andrey Grodzovsky	c96cf2823d	drm/amdgpu: Switch from system_highpri_wq to system_unbound_wq This is to avoid queueing jobs to same CPU during XGMI hive reset because there is a strict timeline for when the reset commands must reach all the GPUs in the hive. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Le Ma <Le.Ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:09:13 -05:00
Andrey Grodzovsky	c6a6e2db99	drm/amdgpu: Redo XGMI reset synchronization. Use task barrier in XGMI hive to synchronize ASIC resets across devices in XGMI hive. v2: Return right away with a warning if no xgmi hive, update doc. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Le Ma <Le.Ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:09:13 -05:00
Andrey Grodzovsky	041a62bc06	drm/amdgpu: reverts commit `ce316fa55e`. In preparation for doing XGMI reset synchronization using task barrier. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Le Ma <Le.Ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:09:12 -05:00
Monk Liu	5a7489a7e1	drm/amdgpu: fix KIQ ring test fail in TDR of SRIOV issues: MEC is ruined by the amdkfd_pre_reset after VF FLR done fix: amdkfd_pre_reset() would ruin MEC after hypervisor finished the VF FLR, the correct sequence is do amdkfd_pre_reset before VF FLR but there is a limitation to block this sequence: if we do pre_reset() before VF FLR, it would go KIQ way to do register access and stuck there, because KIQ probably won't work by that time (e.g. you already made GFX hang) so the best way right now is to simply remove it. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:09:12 -05:00
Nirmoy Das	f880799d7f	amd/amdgpu: add sched array to IPs with multiple run-queues This sched array can be passed on to entity creation routine instead of manually creating such sched array on every context creation. v2: squash in missing break fix Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:09:12 -05:00
Nirmoy Das	0c88b43032	drm/amdgpu: replace vm_pte's run-queue list with drm gpu scheds list drm_sched_entity_init() takes drm gpu scheduler list instead of drm_sched_rq list. This makes conversion of drm_sched_rq list to drm gpu scheduler list unnecessary Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:09:12 -05:00
Emily Deng	8973d9ec8f	drm/amdgpu/sriov: Tonga sriov also need load firmware with smu Fix Tonga sriov load driver fail issue. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewd-by Yintian Tao <Yintian.tao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:09:06 -05:00
Yong Zhao	d7f72fe482	drm/amdgpu: Add CU info print log The log will be useful for easily getting the CU info on various emulation models or ASICs. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-18 16:09:06 -05:00
Simon Ser	93b09a9a89	drm/amdgpu: log when amdgpu.dc=1 but ASIC is unsupported This makes it easier to figure out whether the kernel parameter has been taken into account. Signed-off-by: Simon Ser <contact@emersion.fr> Cc: Harry Wentland <hwentlan@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-11 15:22:08 -05:00
Yintian Tao	c9ffa427db	drm/amd/powerplay: enable pp one vf mode for vega10 Originally, due to the restriction from PSP and SMU, VF has to send message to hypervisor driver to handle powerplay change which is complicated and redundant. Currently, SMU and PSP can support VF to directly handle powerplay change by itself. Therefore, the old code about the handshake between VF and PF to handle powerplay will be removed and VF will use new the registers below to handshake with SMU. mmMP1_SMN_C2PMSG_101: register to handle SMU message mmMP1_SMN_C2PMSG_102: register to handle SMU parameter mmMP1_SMN_C2PMSG_103: register to handle SMU response v2: remove module parameter pp_one_vf v3: fix the parens v4: forbid vf to change smu feature v5: use hwmon_attributes_visible to skip sepicified hwmon atrribute v6: change skip condition at vega10_copy_table_to_smc Signed-off-by: Yintian Tao <yttao@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-11 15:22:07 -05:00
Le Ma	00eaa57172	drm/amdgpu: clear err_event_athub flag after reset exit Otherwise next err_event_athub error cannot call gpu reset. And following resume sequence will not be affected by this flag. v2: create function to clear amdgpu_ras_in_intr for modularity of ras driver Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-05 16:26:11 -05:00
Le Ma	b823821f22	drm/amdgpu: support full gpu reset workflow when ras err_event_athub occurs This athub fatal error can be recovered by baco without system-level reboot, so add a mode to use baco for the recovery. Not affect the default psp reset situations for now. Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-05 16:26:03 -05:00
Le Ma	ce316fa55e	drm/amdgpu: add concurrent baco reset support for XGMI Currently each XGMI node reset wq does not run in parrallel if bound to same cpu. Make change to bound the xgmi_reset_work item to different cpus. XGMI requires all nodes enter into baco within very close proximity before any node exit baco. So schedule the xgmi_reset_work wq twice for enter/exit baco respectively. To use baco for XGMI, PMFW supported for baco on XGMI needs to be involved. The case that PSP reset and baco reset coexist within an XGMI hive never exist and is not in the consideration. v2: define use_baco flag to simplify the code for xgmi baco sequence Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-05 16:25:53 -05:00
Le Ma	7a22677b95	drm/amdgpu: enable/disable doorbell interrupt in baco entry/exit helper This operation is needed when baco entry/exit for ras recovery Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-05 16:25:45 -05:00
Emily Deng	0ea203a912	drm/amdgpu/sriov: No need the event 3 and 4 now As will call unload kms when initialize fail, and the unload kms will send event 3 and 4, so don't need event 3 and 4 in device init. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: Zhan Liu <zhan.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-02 17:39:28 -05:00
Yintian Tao	7c868b592d	drm/amdgpu: not remove sysfs if not create sysfs When load amdgpu failed before create pm_sysfs and ucode_sysfs, the pm_sysfs and ucode_sysfs should not be removed. Otherwise, there will be warning call trace just like below. [ 24.836386] [drm] VCE initialized successfully. [ 24.841352] amdgpu 0000:00:07.0: amdgpu_device_ip_init failed [ 25.370383] amdgpu 0000:00:07.0: Fatal error during GPU init [ 25.889575] [drm] amdgpu: finishing device. [ 26.069128] amdgpu 0000:00:07.0: [drm:amdgpu_ring_test_helper [amdgpu]] ERROR ring kiq_2.1.0 test failed (-110) [ 26.070110] [drm:gfx_v9_0_hw_fini [amdgpu]] ERROR KCQ disable failed [ 26.200309] [TTM] Finalizing pool allocator [ 26.200314] [TTM] Finalizing DMA pool allocator [ 26.200349] [TTM] Zone kernel: Used memory at exit: 0 KiB [ 26.200351] [TTM] Zone dma32: Used memory at exit: 0 KiB [ 26.200353] [drm] amdgpu: ttm finalized [ 26.205329] ------------[ cut here ]------------ [ 26.205330] sysfs group 'fw_version' not found for kobject '0000:00:07.0' [ 26.205347] WARNING: CPU: 0 PID: 1228 at fs/sysfs/group.c:256 sysfs_remove_group+0x80/0x90 [ 26.205348] Modules linked in: amdgpu(OE+) gpu_sched(OE) ttm(OE) drm_kms_helper(OE) drm(OE) i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt rpcsec_gss_krb5 auth_rpcgss nfsv4 nfs lockd grace fscache binfmt_misc snd_hda_codec_generic ledtrig_audio crct10dif_pclmul snd_hda_intel crc32_pclmul snd_hda_codec ghash_clmulni_intel snd_hda_core snd_hwdep snd_pcm snd_timer input_leds snd joydev soundcore serio_raw pcspkr evbug aesni_intel aes_x86_64 crypto_simd cryptd mac_hid glue_helper sunrpc ip_tables x_tables autofs4 8139too psmouse 8139cp mii i2c_piix4 pata_acpi floppy [ 26.205369] CPU: 0 PID: 1228 Comm: modprobe Tainted: G OE 5.2.0-rc1 #1 [ 26.205370] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 [ 26.205372] RIP: 0010:sysfs_remove_group+0x80/0x90 [ 26.205374] Code: e8 35 b9 ff ff 5b 41 5c 41 5d 5d c3 48 89 df e8 f6 b5 ff ff eb c6 49 8b 55 00 49 8b 34 24 48 c7 c7 48 7a 70 98 e8 60 63 d3 ff <0f> 0b eb d7 66 90 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 [ 26.205375] RSP: 0018:ffffbee242b0b908 EFLAGS: 00010282 [ 26.205376] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000006 [ 26.205377] RDX: 0000000000000007 RSI: 0000000000000092 RDI: ffff97ad6f817380 [ 26.205377] RBP: ffffbee242b0b920 R08: ffffffff98f520c4 R09: 00000000000002b3 [ 26.205378] R10: ffffbee242b0b8f8 R11: 00000000000002b3 R12: ffffffffc0e58240 [ 26.205379] R13: ffff97ad6d1fe0b0 R14: ffff97ad4db954c8 R15: ffff97ad4db7fff0 [ 26.205380] FS: 00007ff3d8a1c4c0(0000) GS:ffff97ad6f800000(0000) knlGS:0000000000000000 [ 26.205381] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 26.205381] CR2: 00007f9b2ef1df04 CR3: 000000042aab8001 CR4: 00000000003606f0 [ 26.205384] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 26.205385] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 26.205385] Call Trace: [ 26.205461] amdgpu_ucode_sysfs_fini+0x18/0x20 [amdgpu] [ 26.205518] amdgpu_device_fini+0x3b4/0x560 [amdgpu] [ 26.205573] amdgpu_driver_unload_kms+0x4f/0xa0 [amdgpu] [ 26.205623] amdgpu_driver_load_kms+0xcd/0x250 [amdgpu] [ 26.205637] drm_dev_register+0x12b/0x1c0 [drm] [ 26.205695] amdgpu_pci_probe+0x12a/0x1e0 [amdgpu] [ 26.205699] local_pci_probe+0x47/0xa0 [ 26.205701] pci_device_probe+0x106/0x1b0 [ 26.205704] really_probe+0x21a/0x3f0 [ 26.205706] driver_probe_device+0x11c/0x140 [ 26.205707] device_driver_attach+0x58/0x60 [ 26.205709] __driver_attach+0xc3/0x140 Signed-off-by: Yintian Tao <yttao@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-02 17:39:05 -05:00
Alex Deucher	de185019bc	drm/amdgpu: move pci handling out of pm ops The documentation says the that PCI core handles this for you unless you choose to implement it. Just rely on the PCI core to handle the pci specific bits. Reviewed-by: Zhan Liu <zhan.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-26 14:51:03 -05:00
Alex Deucher	3840c5bcc2	drm/amdgpu: disentangle runtime pm and vga_switcheroo Originally we only supported runtime pm on PX/HG laptops so vga_switcheroo and runtime pm are sort of entangled. Attempt to logically separate them. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 16:42:53 -05:00
Alex Deucher	361dbd01a1	drm/amdgpu: add helpers for baco entry and exit BACO - Bus Active, Chip Off Will be used for runtime pm. Entry will enter the BACO state (chip off). Exit will exit the BACO state (chip on). Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 16:42:52 -05:00
Alex Deucher	31af062acf	drm/amdgpu: rename amdgpu_device_is_px to amdgpu_device_supports_boco (v2) BACO - Bus Active, Chip Off BOCO - Bus Off, Chip Off To better match what we are checking for and to align with amdgpu_device_supports_baco. BOCO is used on PowerXpress/Hybrid Graphics systems and BACO is used on desktop dGPU boards. v2: fix typo in documentation Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 16:42:50 -05:00
Alex Deucher	a69cba42b1	drm/amdgpu: add a amdgpu_device_supports_baco helper BACO - Bus Active, Chip Off To check if a device supports BACO or not. This will be used in determining when to enable runtime pm. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 16:42:49 -05:00
Yintian Tao	d0d13fe874	drm/amdgpu: put flush_delayed_work at first There is one regression from 042f3d7b745cd76aa To put flush_delayed_work after adev->shutdown = true which will make amdgpu_ih_process not response the irq At last, all ib ring tests will be failed just like below [drm] amdgpu: finishing device. [drm] Fence fallback timer expired on ring gfx [drm] Fence fallback timer expired on ring comp_1.0.0 [drm] Fence fallback timer expired on ring comp_1.1.0 [drm] Fence fallback timer expired on ring comp_1.2.0 [drm] Fence fallback timer expired on ring comp_1.3.0 [drm] Fence fallback timer expired on ring comp_1.0.1 amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] ERROR IB test failed on comp_1.1.1 (-110). amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] ERROR IB test failed on comp_1.2.1 (-110). amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] ERROR IB test failed on comp_1.3.1 (-110). amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] ERROR IB test failed on sdma0 (-110). amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] ERROR IB test failed on sdma1 (-110). amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] ERROR IB test failed on uvd_enc_0.0 (-110). amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] ERROR IB test failed on vce0 (-110). [drm:amdgpu_device_delayed_init_work_handler [amdgpu]] ERROR ib ring test failed (-110). v2: replace cancel_delayed_work_sync() with flush_delayed_work() Signed-off-by: Yintian Tao <yttao@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 10:12:51 -05:00
Leo Liu	52f2e779ad	drm/amdgpu: add driver support for JPEG2.0 and above By using JPEG IP block type Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 10:12:50 -05:00
yu kuai	b8b7213057	drm/amdgpu: add function parameter description in 'amdgpu_device_set_cg_state' Fixes gcc warning: drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1954: warning: Function parameter or member 'state' not described in 'amdgpu_device_set_cg_state' Fixes: `e3ecdffac9` ("drm/amdgpu: add documentation for amdgpu_device.c") Signed-off-by: yu kuai <yukuai3@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-13 15:29:44 -05:00
Bhawanpreet Lakha	b86a1aa36a	drm/amd/display: rename DCN1_0 kconfig to DCN Since dcn20 and dcn21 are under dcn1 it doesnt make sense to have it named dcn1. Change it to "dcn" to make it generic Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-13 15:29:44 -05:00
Bhawanpreet Lakha	aca935c7cc	drm/amd/display: Drop CONFIG_DRM_AMD_DC_DCN2_1 flag [Why] DCN21 is stable enough to be build by default. So drop the flags. [How] Remove them using the unifdef tool. The following commands were executed in sequence: $ find -name '.c' -exec unifdef -m -DCONFIG_DRM_AMD_DC_DCN2_1 -UCONFIG_TRIM_DRM_AMD_DC_DCN2_1 '{}' ';' $ find -name '.h' -exec unifdef -m -DCONFIG_DRM_AMD_DC_DCN2_1 -UCONFIG_TRIM_DRM_AMD_DC_DCN2_1 '{}' ';' In addition: * Remove from kconfig, and replace any dependencies with DCN1_0. * Remove from any makefiles. * Fix and cleanup Renoir definitions in dal_asic_id.h * Expand DCN1 ifdef to include DCN21 code in the following files: * clk_mgr/clk_mgr.c: dc_clk_mgr_create() * core/dc_resources.c: dc_create_resource_pool() * gpio/hw_factory.c: dal_hw_factory_init() * gpio/hw_translate.c: dal_hw_translate_init() Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-13 15:29:44 -05:00
Bhawanpreet Lakha	1da37801a8	drm/amd/display: Drop CONFIG_DRM_AMD_DC_DCN2_0 and DSC_SUPPORTED [Why] DCN2 and DSC are stable enough to be build by default. So drop the flags. [How] Remove them using the unifdef tool. The following commands were executed in sequence: $ find -name '.c' -exec unifdef -m -DCONFIG_DRM_AMD_DC_DSC_SUPPORT -DCONFIG_DRM_AMD_DC_DCN2_0 -UCONFIG_TRIM_DRM_AMD_DC_DCN2_0 '{}' ';' $ find -name '.h' -exec unifdef -m -DCONFIG_DRM_AMD_DC_DSC_SUPPORT -DCONFIG_DRM_AMD_DC_DCN2_0 -UCONFIG_TRIM_DRM_AMD_DC_DCN2_0 '{}' ';' In addition: * Remove from kconfig, and replace any dependencies with DCN1_0. * Remove from any makefiles. * Fix and cleanup NV defninitions in dal_asic_id.h * Expand DCN1 ifdef to include DCN2 code in the following files: * clk_mgr/clk_mgr.c: dc_clk_mgr_create() * core/dc_resources.c: dc_create_resource_pool() * dce/dce_dmcu.c: dcn20_lock_phy() dce/dce_dmcu.c: dcn20_funcs * dce/dce_dmcu.c: dcn20_dmcu_create() * gpio/hw_factory.c: dal_hw_factory_init() * gpio/hw_translate.c: dal_hw_translate_init() Signed-off-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-13 15:29:44 -05:00
Jesse Zhang	9f87516764	drm/amd/amdgpu: finish delay works before release resources flush/cancel delayed works before doing finalization to avoid concurrently requests. Signed-off-by: Jesse Zhang <zhexi.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-11 17:38:13 -05:00
Evan Quan	60599a0363	drm/amdgpu: perform p-state switch after the whole hive initialized P-state switch should be performed after all devices from the hive get initialized. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Reviewed-by: Jonathan Kim <Jonathan.Kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-06 16:27:48 -05:00
Evan Quan	b0adca4d50	drm/amdgpu: register gpu instance before fan boost feature enablment Otherwise, the feature enablement will be skipped due to wrong count. Fixes: `beff74bc6e` ("drm/amdgpu: fix a race in GPU reset with IB test (v2)") Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-06 16:27:47 -05:00
Evan Quan	39ea6e5f9e	drm/amdgpu: change pstate only after all XGMI device initialized Pstate settings should be performed after all device of the XGMI setup get initialized. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-06 16:27:46 -05:00
Le Ma	bff77e86a3	drm/amdgpu: bypass some cleanup work after err_event_athub (v2) PSP lost connection when err_event_athub occurs. These cleanup work can be skipped in BACO reset. v2: squash in missing include (Alex) Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <hawking.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-30 11:06:52 -04:00
Wambui Karuga	f440ff44b1	drm/amd: correct "_LENTH" mispelling in constant Correct the "_LENTH" mispelling in the AMDGPU_MAX_TIMEOUT_PARAM_LENGTH constant. Signed-off-by: Wambui Karuga <wambui.karugax@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-28 11:19:17 -04:00
Andrey Grodzovsky	121a2bc6ae	drm/amdgpu: Move amdgpu_ras_recovery_init to after SMU ready. For Arcturus the I2C traffic is done through SMU tables and so we must postpone RAS recovery init to after they are ready which is in amdgpu_device_ip_hw_init_phase2. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-25 16:50:10 -04:00
Tianci.Yin	e35e2b117f	drm/amdgpu: add a generic fb accessing helper function(v3) add a generic helper function for accessing framebuffer via MMIO Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-17 16:31:13 -04:00
Alex Deucher	897483d8a0	drm/amdgpu: move gpu reset out of amdgpu_device_suspend Move it into the caller. There are cases were we don't want it. We need it for hibernation, but we don't need it for runtime pm, so drop it for runtime pm. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-15 15:51:39 -04:00
Alex Deucher	803cc26d5c	drm/amdgpu: move pci_save_state into suspend path for amdgpu_device_suspend. This follows the logic in the resume path. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-15 15:51:39 -04:00
Emily Deng	bcccee89f4	drm/amdgpu: Fix tdr3 could hang with slow compute issue When index is 1, need to set compute ring timeout for sriov and passthrough. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-15 15:50:23 -04:00
Alex Deucher	71f98027f2	drm/amdgpu: move amdgpu_device_get_job_timeout_settings It's only used in amdgpu_device.c and the naming also reflects that. Move it there. Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-07 15:10:20 -05:00
Lyude Paul	f8d2d39eb4	drm/amdgpu: Iterate through DRM connectors correctly Currently, every single piece of code in amdgpu that loops through connectors does it incorrectly and doesn't use the proper list iteration helpers, drm_connector_list_iter_begin() and drm_connector_list_iter_end(). Yeesh. So, do that. Cc: Juston Li <juston.li@intel.com> Cc: Imre Deak <imre.deak@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Harry Wentland <hwentlan@amd.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:11:05 -05:00
Jesse Zhang	df99ac0fcc	drm/amd/amdgpu:Fix compute ring unable to detect hang. When compute fence did not signal, compute ring cannot detect hardware hang because its timeout value is set to be infinite by default. In SR-IOV and passthrough mode, if user does not declare custome timeout value for compute ring, then use gfx ring timeout value as default. So that when there is a ture hardware hang, compute ring can detect it. Signed-off-by: Jesse Zhang <zhexi.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:11:01 -05:00
Xiaojie Yuan	ec51d3facd	drm/amdgpu/discovery: get gpu info from ip discovery table except soc_bounding_box which is not integrated in discovery table yet Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:10:59 -05:00
Evan Quan	0e0b89c0d7	drm/amd/powerplay: properly set mp1 state for SW SMU suspend/reset routine Set mp1 state properly for SW SMU suspend/reset routine. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:10:58 -05:00
Shirish S	a35ad98bf9	drm/amdgpu: remove needless usage of #ifdef define sched_policy in case CONFIG_HSA_AMD is not enabled, with this there is no need to check for CONFIG_HSA_AMD else where in driver code. Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Shirish S <shirish.s@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:05:20 -05:00
Shirish S	8c9f69bc5c	drm/amdgpu: fix build error without CONFIG_HSA_AMD If CONFIG_HSA_AMD is not set, build fails: drivers/gpu/drm/amd/amdgpu/amdgpu_device.o: In function `amdgpu_device_ip_early_init': drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:1626: undefined reference to `sched_policy' Use CONFIG_HSA_AMD to guard this. Fixes: 1abb680ad371 ("drm/amdgpu: disable gfxoff while use no H/W scheduling policy") Signed-off-by: Shirish S <shirish.s@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:05:07 -05:00
Huang Rui	aa978594cf	drm/amdgpu: disable gfxoff while use no H/W scheduling policy While gfxoff is enabled, the mmVM_XXX registers will be 0xfffffff while the GFX is in "off" state. KFD queue creattion doesn't use ring based method, so it will trigger a VM fault. Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 09:55:41 -05:00
Yong Zhao	4e66d7d215	drm/amdgpu: Add a kernel parameter for specifying the asic type As more and more new asics start to reuse the old device IDs before launch, there is a need to quickly override the existing asic type corresponding to the reused device ID through a kernel parameter. With this, engineers no longer need to rely on local hack patches, facilitating cooperation across teams. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 09:54:25 -05:00
Tao Zhou	1a6fc071e1	drm/amdgpu: move the call of ras recovery_init and bad page reserve to proper place ras recovery_init should be called after ttm init, bad page reserve should be put in front of gpu reset since i2c may be unstable during gpu reset. add cleanup for recovery_init and recovery_fini v2: add more comment and print. remove cancel_work_sync in recovery_init. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:50:47 -05:00
Yong Zhao	050091ab6e	drm/amdkfd: Query kfd device info by CHIP id instead of pci device id This optimizes out the pci device id usage in KFD and makes the code more maintainable. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:49:45 -05:00
Andrey Grodzovsky	d5ea093eeb	dmr/amdgpu: Add system auto reboot to RAS. In case of RAS error allow user configure auto system reboot through ras_ctrl. This is also part of the temproray work around for the RAS hang problem. v4: Use latest kernel API for disk sync. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:41:17 -05:00
Andrey Grodzovsky	7c6e68c777	drm/amdgpu: Avoid HW GPU reset for RAS. Problem: Under certain conditions, when some IP bocks take a RAS error, we can get into a situation where a GPU reset is not possible due to issues in RAS in SMU/PSP. Temporary fix until proper solution in PSP/SMU is ready: When uncorrectable error happens the DF will unconditionally broadcast error event packets to all its clients/slave upon receiving fatal error event and freeze all its outbound queues, err_event_athub interrupt will be triggered. In such case and we use this interrupt to issue GPU reset. THe GPU reset code is modified for such case to avoid HW reset, only stops schedulers, deatches all in progress and not yet scheduled job's fences, set error code on them and signals. Also reject any new incoming job submissions from user space. All this is done to notify the applications of the problem. v2: Extract amdgpu_amdkfd_pre/post_reset from amdgpu_device_lock/unlock_adev Move amdgpu_job_stop_all_jobs_on_sched to amdgpu_job.c Remove print param from amdgpu_ras_query_error_count v3: Update based on prevoius bug fixing patch to properly call amdgpu_amdkfd_pre_reset for other XGMI hive memebers. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:41:05 -05:00
Andrey Grodzovsky	12ffa55da6	drm/amdgpu: Fix bugs in amdgpu_device_gpu_recover in XGMI case. Issue 1: In XGMI case amdgpu_device_lock_adev for other devices in hive was called to late, after access to their repsective schedulers. So relocate the lock to the begining of accessing the other devs. Issue 2: Using amdgpu_device_ip_need_full_reset to switch the device list from all devices in hive to the single 'master' device who owns this reset call is wrong because when stopping schedulers we iterate all the devices in hive but when restarting we will only reactivate the 'master' device. Also, in case amdgpu_device_pre_asic_reset conlcudes that full reset IS needed we then have to stop schedulers for all devices in hive and not only the 'master' but with amdgpu_device_ip_need_full_reset we already missed the opprotunity do to so. So just remove this logic and always stop and start all schedulers for all devices in hive. Also minor cleanup and print fix. v4: Minor coding style fix. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:39:05 -05:00
Andrey Grodzovsky	0b2d2c2eec	drm/amdgpu: Handle job is NULL use case in amdgpu_device_gpu_recover This should be checked at all places job is accessed. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-30 15:02:39 -05:00
Roman Li	e1c14c4339	drm/amdgpu: Enable DC on Renoir Enable DC support for renoir. Acked-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Roman Li <Roman.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-29 15:52:34 -05:00
Monk Liu	e352625796	drm/amdgpu: introduce vram lost for reset (v2) for SOC15/vega10 the BACO reset & mode1 would introduce vram lost in high end address range, current kmd's vram lost checking cannot catch it since it only check very ahead visible frame buffer v2: cover NV as well Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-29 15:52:32 -05:00
Andrey Grodzovsky	c43b849f89	drm/amdgpu: Use new mode2 reset interface for RV. Integrate the mode2 reset into rest sequence. v2: Check ppfuncs pointer for NULL Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-15 11:00:44 -05:00
Huang Rui	b51a26a02a	drm/amdgpu: add renoir support for gpu_info and ip block setting This patch adds renoir support for gpu_info firmware and ip block setting. Acked-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-12 12:47:49 -05:00
Huang Rui	1eee4228a5	drm/amdgpu: add renoir asic_type enum This patch adds renoir to amd_asic_type enum and amdgpu_asic_name[]. Acked-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-12 12:47:49 -05:00
Tao Zhou	6ca523d7eb	drm/amdgpu: remove RREG64/WREG64 atomic 64 bits REG operations are useless currently Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-09 11:17:30 -05:00
Tao Zhou	c6dddf4540	drm/amdgpu: replace readq/writeq with atomic64 operations what we really want is a read or write that is guaranteed to be 64 bits at a time, atomic64 operations are supported on all architectures Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-09 11:14:11 -05:00
Andrey Grodzovsky	b5507c7e06	drm/amdgpu: Fix GPU reset crash regression. amdgpu_ip_block.status.hw for GMC wasn't set to false on suspend during GPU reset and so on resume gmc_v9_0_resume wasn't called. Caused by 'drm/amdgpu: fix double ucode load by PSP(v3)' Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-06 13:53:19 -05:00
xinhui pan	876923fb92	drm/amdgpu: Fix panic during gpu reset Clear the flag after hw suspend, otherwise it skips the corresponding hw resume. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-06 13:52:37 -05:00
Leo Li	078655d982	drm/amdgpu: Add nv12 DC ip block Load DC and amdgpu display manager Signed-off-by: Leo Li <sunpeng.li@amd.com> Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-02 10:30:41 -05:00
Xiaojie Yuan	4808cf9c2a	drm/amdgpu: set asic family and ip blocks for navi12 same with navi10 Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-02 10:30:39 -05:00
Xiaojie Yuan	42b325e5ec	drm/amdgpu: add gpu_info firmware for navi12 gpu_info firmare store asic configuration details. Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-02 10:30:39 -05:00
Xiaojie Yuan	9802f5d78b	drm/amdgpu: add navi12 asic type Add asic type. Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-02 10:30:39 -05:00
Monk Liu	482f0e5385	drm/amdgpu: fix double ucode load by PSP(v3) previously the ucode loading of PSP was repreated, one executed in phase_1 init/re-init/resume and the other in fw_loading routine Avoid this double loading by clearing ip_blocks.status.hw in suspend or reset prior to the FW loading and any block's hw_init/resume v2: still do the smu fw loading since it is needed by bare-metal v3: drop the change in reinit_early_sriov, just clear all block's status.hw in the head place and set the status.hw after hw_init done is enough Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-02 10:17:34 -05:00
Monk Liu	4cd4c5c064	drm/amdgpu: cleanup vega10 SRIOV code path we can simplify all those unnecessary function under SRIOV for vega10 since: 1) PSP L1 policy is by force enabled in SRIOV 2) original logic always set all flags which make itself a dummy step besides, 1) the ih_doorbell_range set should also be skipped for VEGA10 SRIOV. 2) the gfx_common registers should also be skipped for VEGA10 SRIOV. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-02 10:17:21 -05:00
Tao Zhou	4fa1c6a679	drm/amdgpu: add RREG64/WREG64(_PCIE) operations add 64 bits register access functions v2: implement 64 bit functions in low level Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Dennis Li <dennis.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-07-31 14:49:40 -05:00
Alex Deucher	a3a09142f4	drm/amdgpu: put the SMC into the proper state on reset/unload When doing a GPU reset or unloading the driver, we need to put the SMU into the apprpriate state for the re-init after the reset or unload to reliably work. I don't think this is necessary for BACO because the SMU actually controls the BACO state to it needs to be active. For suspend (S3), the asic is put into D3 so the SMU would be powered down so I don't think we need to put the SMU into any special state. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-07-30 23:24:44 -05:00
Le Ma	65e60f6e06	drm/amdgpu: add Arcturus gpu info firmware Add GPU info firmware for Arcturus. Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-07-18 14:18:03 -05:00
Le Ma	61cf44c1db	drm/amdgpu: add to set Arcturus ip blocks Add IP blocks for Arcturus. Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-07-18 14:18:02 -05:00
Le Ma	d6c3b24ea2	drm/amdgpu: add Arcturus asic type Add asic type for Arcturus. Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-07-18 14:18:01 -05:00
Bhawanpreet Lakha	8fceceb69e	drm/amd/display: add dm block enable DC for navi14. Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-07-18 14:18:00 -05:00
Xiaojie Yuan	7ecb5cd451	drm/amdgpu: set asic family and ip blocks for navi14 same with navi10 Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Jack Xiao <Jack.Xiao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-07-18 14:17:58 -05:00
Xiaojie Yuan	ed42cfe1ac	drm/amdgpu: add gpu_info firmware for navi14 Add navi14 to case statement to load the GPU info firmware. Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Jack Xiao <Jack.Xiao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-07-18 14:17:58 -05:00
Xiaojie Yuan	87dbad02d2	drm/amdgpu: add navi14 asic type Add CHIP_NAVI14 to the list of asic types. Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Jack Xiao <Jack.Xiao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-07-18 14:17:58 -05:00
Alex Deucher	32eaeae0ef	drm/amdgpu/psp: add a mutex to protect access to the psp ring We need to serialize access to the psp ring if there are multiple callers at runtime. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-07-09 17:43:39 -05:00
Alex Deucher	f54eeab4e7	drm/amdgpu: properly guard the generic discovery code It's only available on navi and newer. Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-07-09 17:43:31 -05:00
Arnd Bergmann	d155bef063	amdgpu: make pmu support optional When CONFIG_PERF_EVENTS is disabled, we cannot compile the pmu portion of the amdgpu driver: drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c:48:38: error: no member named 'hw' in 'struct perf_event' struct hw_perf_event *hwc = &event->hw; ~~~~~ ^ drivers/gpu/drm/amd/amdgpu/amdgpu_pmu.c:51:13: error: no member named 'attr' in 'struct perf_event' if (event->attr.type != event->pmu->type) ~~~~~ ^ ... Use conditional compilation for this file. Fixes: `9c7c85f7ea` ("drm/amdgpu: add pmu counters") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-07-08 13:56:22 -05:00
xinhui pan	f1c1314be4	drm/amdgpu: Disable ras features on all IPs before gpu reset Perform a ras_suspend to disable ras on all IPs to workaround some ROCm stability issue. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-07-05 15:59:20 -05:00
Jack Xiao	b2109d8ed6	drm/amdgpu: enable PCIE atomics ops support GPU atomics operation depends on PCIE atomics support. Always enable PCIE atomics ops support in case that it hasn't been enabled. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-07-01 14:54:40 -05:00
Evan Quan	fdafb3597a	drm/amdgpu: fix MGPU fan boost enablement for XGMI reset MGPU fan boost feature should not be enabled until all the devices from the same hive are all back from reset. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-07-01 14:54:12 -05:00
Alex Deucher	d7929c1e13	Merge branch 'drm-next' into drm-next-5.3 Backmerge drm-next and fix up conflicts due to drmP.h removal. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-25 08:42:25 -05:00
Harry Wentland	b4f199c7b0	drm/amdgpu: Enable DC support for Navi10 Enable the IP for navi10. Signed-off-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-22 09:34:07 -05:00
Harry Wentland	48321c3dde	drm/amd/display: Read soc_bounding_box from gpu_info (v2) [WHY] We don't want to expose sensitive ASIC information before ASIC release. [HOW] Encode the soc_bounding_box in the gpu_info FW (for Linux) and read it at driver load. v2: fix warning when CONFIG_DRM_AMD_DC_DCN2_0 is not set (Alex) Signed-off-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-21 18:59:34 -05:00
Huang Rui	0a5b8c7b94	drm/amdgpu: add to set navi ip blocks Set the IPs for navi10 in early_init like other asics. Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-21 18:59:24 -05:00
Hawking Zhang	e0d076574e	drm/amdgpu: update golden setting programming logic Since from soc15, make sure only AndMasked bit get changed when applied or_mask Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Le Ma <Le.Ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-21 18:59:24 -05:00
Jack Xiao	5f84cc635b	drm/amdgpu/mes: enable mes on navi10 and later asic When amdgpu_mes is enabled and asic family is navi10 and later asic, enable mes per device. Acked-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-21 18:58:22 -05:00
Xiaojie Yuan	a190d1c75c	drm/amdgpu/discovery: add module param for ip discovery enablement to control enablement. Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-21 18:58:21 -05:00
Jack Xiao	6698a3d05f	drm/amdgpu: add mcbp unit test in debugfs (v3) The MCBP unit test is used to test the functionality of MCBP. It emualtes to send preemption request and resubmit the unfinished jobs. v2: squash in fixes (Alex) v3: squash in memory leak fix (Jack) Acked-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-21 18:58:21 -05:00
Jack Xiao	f92d5c6123	drm/amdgpu: enable the static csa when mcbp enabled CSA is the Context Save Area for preemption. Acked-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-21 09:36:54 -05:00
Jack Xiao	b239c01727	drm/amdgpu: add mcbp driver parameter Add mcbp driver parameter, so that mcbp feature can be enabled/disabled by driver parameter. Acked-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-21 09:30:52 -05:00
Hawking Zhang	35c2e91059	drm/amdgpu: parse the new members added by gpu_info ucode v1_1 Parse the new parameters for gfx10. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-20 21:16:16 -05:00
Huang Rui	23c6268eb1	drm/amdgpu: add navi10 gpu info firmware gpu info firmware stores configuration data for various IP blocks. Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-20 21:15:29 -05:00
Huang Rui	852a6626d5	drm/amdgpu: add navi10 asic type Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-20 15:54:56 -05:00
Jonathan Kim	9c7c85f7ea	drm/amdgpu: add pmu counters adding perf event counters Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-20 11:36:22 -05:00
Daniel Vetter	52d2d44eee	Linux 5.2-rc5 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl0Gj1MeHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGctkH/0At3+SQPY2JJSy8 i6+TDeytFx9OggeGLPHChRfehkAlvMb/kd34QHnuEvDqUuCAMU6HZQJFKoK9mvFI sDJVayPGDSqpm+iv8qLpMBPShiCXYVnGZeVfOdv36jUswL0k6wHV1pz4avFkDeZa 1F4pmI6O2XRkNTYQawbUaFkAngWUCBG9ECLnHJnuIY6ohShBvjI4+E2JUaht+8gO M2h2b9ieddWmjxV3LTKgsK1v+347RljxdZTWnJ62SCDSEVZvsgSA9W2wnebVhBkJ drSmrFLxNiM+W45mkbUFmQixRSmjv++oRR096fxAnodBxMw0TDxE1RiMQWE6rVvG N6MC6xA= =+B0P -----END PGP SIGNATURE----- Merge v5.2-rc5 into drm-next Maarten needs -rc4 backmerged so he can pull in the fbcon notifier removal topic branch into drm-misc-next. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2019-06-19 12:07:29 +02:00
Alex Deucher	21a249ca02	drm/amdgpu: wait to fetch the vbios until after common init We need the asic_funcs set for the get rom callbacks in some cases. Tested-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-17 11:02:03 -05:00
Daniel Vetter	2454fcea33	drm-misc-next for v5.3: UAPI Changes: Cross-subsystem Changes: - Add code to signal all dma-fences when freed with pending signals. - Annotate reservation object access in CONFIG_DEBUG_MUTEXES Core Changes: - Assorted documentation fixes. - Use irqsave/restore spinlock to add crc entry. - Move code around to drm_client, for internal modeset clients. - Make drm_crtc.h and drm_debugfs.h self-contained. - Remove drm_fb_helper_connector. - Add bootsplash to todo. - Fix lock ordering in pan_display_legacy. - Support pinning buffers to current location in gem-vram. - Remove the now unused locking functions from gem-vram. - Remove the now unused kmap-object argument from vram helpers. - Stop checking return value of debugfs_create. - Add atomic encoder enable/disable helpers. - pass drm_atomic_state to atomic connector check. - Add atomic support for bridge enable/disable. - Add self refresh helpers to core. Driver Changes: - Add extra delay to make MTP SDM845 work. - Small fixes to virtio, vkms, sii902x, sii9234, ast, mcde, analogix, rockchip. - Add zpos and ?BGR8888 support to meson. - More removals of drm_os_linux and drmP headers for amd, radeon, sti, r128, r128, savage, sis. - Allow synopsis to unwedge the i2c hdmi bus. - Add orientation quirks for GPD panels. - Edid cleanups and fixing handling for edid < 1.2. - Add runtime pm to stm. - Handle s/r in dw-hdmi. - Add hooks for power on/off to dsi for stm. - Remove virtio dirty tracking code, done in drm core. - Rework BO handling in ast and mgag200. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAl0DYU8ACgkQ/lWMcqZw E8NNWw/+MhcRakQmrNDMRIj4DvukzPW2efXbhRFuvthUvVN7rOHMzQZBc3le+gUb 2GGoEeUYG7XoA/Nj3ZQMUoalrjODywtLClBClC4Blped0mZ4JPiI7bTsrNILn1N1 hZ0+DbffMCAKqKN8TftK/TrFF9IEM8JSftqD/1RdkiXVcMH3NKuLABHZxzPxH2BH XuSqIL5lDyAtanixB53aDf2gw9iipUphYoFlKhdx9dr5Ql96RhiOcDgFhXnFiQu4 O9z3W6tRI2VPoCzsnhNy3Eah7rBDnZwvyfGa9YU/Q+VeHegb601p8OmNNwpshWE1 ohixBbADj0dr+K3T/lVW30kovg34i4L5K3O7MR0HxWYSA7+v3AHyG7/GWLxbBNQn AFHTRbBph8aP/Dn24ucbKaB7wHi31j7b0Hxj+oJR8RoGhuOYcMOuZrCHqpAxStto riSVDCRcq/KcPiuqZZ1UnzFWlQMhNFUwumloPiXFkJ4mcSdK9IbdKBd2eqbRdaU1 eTOA4istVgNgaNbgLvVB2ltjqXrsdio7/jh6RhobFPqHISiL7iMZg3C/KRBXrkUB lYMeGkiE3Wp77zdycdofuEbMfAYUwLts8EYjVsM6xo0BKlBYhpeVuBOYeQEkU7PV PpGYqQVeZUoD1OyGlMWIYoyb5Ya7OLUDpooOJdFqoPzUfDki31E= =4uQX -----END PGP SIGNATURE----- Merge tag 'drm-misc-next-2019-06-14' of git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v5.3: UAPI Changes: Cross-subsystem Changes: - Add code to signal all dma-fences when freed with pending signals. - Annotate reservation object access in CONFIG_DEBUG_MUTEXES Core Changes: - Assorted documentation fixes. - Use irqsave/restore spinlock to add crc entry. - Move code around to drm_client, for internal modeset clients. - Make drm_crtc.h and drm_debugfs.h self-contained. - Remove drm_fb_helper_connector. - Add bootsplash to todo. - Fix lock ordering in pan_display_legacy. - Support pinning buffers to current location in gem-vram. - Remove the now unused locking functions from gem-vram. - Remove the now unused kmap-object argument from vram helpers. - Stop checking return value of debugfs_create. - Add atomic encoder enable/disable helpers. - pass drm_atomic_state to atomic connector check. - Add atomic support for bridge enable/disable. - Add self refresh helpers to core. Driver Changes: - Add extra delay to make MTP SDM845 work. - Small fixes to virtio, vkms, sii902x, sii9234, ast, mcde, analogix, rockchip. - Add zpos and ?BGR8888 support to meson. - More removals of drm_os_linux and drmP headers for amd, radeon, sti, r128, r128, savage, sis. - Allow synopsis to unwedge the i2c hdmi bus. - Add orientation quirks for GPD panels. - Edid cleanups and fixing handling for edid < 1.2. - Add runtime pm to stm. - Handle s/r in dw-hdmi. - Add hooks for power on/off to dsi for stm. - Remove virtio dirty tracking code, done in drm core. - Rework BO handling in ast and mgag200. Tiny conflict in drivers/gpu/drm/amd/display/dc/clk_mgr/clk_mgr.c, needed #include <linux/slab.h> to make it compile. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/0e01de30-9797-853c-732f-4a5bd6e61445@linux.intel.com	2019-06-14 11:44:24 +02:00
Yintian Tao	e9bc1bf791	drm/amdgpu: register pm sysfs for sriov (v2) we need register pm sysfs for virt in order to support dpm level modification because smu ip block will not be added under SRIOV v2: whitespace fixes (Alex) Signed-off-by: Yintian Tao <yttao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-13 13:59:48 -05:00
Tom St Denis	b4559a1646	drm/amd/amdgpu: remove vram_page_split kernel option (v3) This option is no longer needed. The default code paths are now the only option. v2: Add HPAGE support and a default for non contiguous maps v3: Misread 512 pages as MiB ... Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-11 12:44:23 -05:00
Prike Liang	80f41f84ae	drm/amd/amdgpu: add RLC firmware to support raven1 refresh Use SMU firmware version to indentify the raven1 refresh device and then load homologous RLC FW. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Suggested-by: Huang Rui<Ray.Huang@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-11 12:40:06 -05:00
Sam Ravnborg	fdf2f6c56e	drm/amd: drop use of drmP.h in amdgpu/amdgpu* Drop use of drmP.h in all files named amdgpu* in drm/amd/amdgpu/ Fix fallout. Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20190609220757.10862-10-sam@ravnborg.org	2019-06-10 23:02:48 +02:00
Alex Deucher	beff74bc6e	drm/amdgpu: fix a race in GPU reset with IB test (v2) Split late_init into two functions, one (do_late_init) which just does the hw init, and late_init which calls do_late_init and schedules the IB test work. Call do_late_init in the GPU reset code to run the init code, but not schedule the IB test code. The IB test code is called directly in the gpu reset code so no need to run the IB tests in a separate work thread. If we do, we end up racing. v2: Rework late_init. Pull out the mgpu fan boost and xgmi pstate code into late_init so they get called in all cases. rename the late_init worker thread to delayed work since it's just the IB tests now which can happen later. Schedule the work at init and resume time. It's not needed at reset time because the IB tests are called directly. Reviewed-by: Christian König <christian.koenig@amd.com> Cc: Xinhui Pan <xinhui.pan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-05 22:18:09 -05:00
xinhui pan	c53e4db712	drm/amdgpu: cancel late_init_work before gpu reset gpu reset will run late_init and schedule the late_init_work. if we keep triggering gpu reset in a short time, there are potenial races. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-05 22:18:09 -05:00
Prike Liang	1929059893	drm/amd/amdgpu: add RLC firmware to support raven1 refresh Use SMU firmware version to indentify the raven1 refresh device and then load homologous RLC FW. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Suggested-by: Huang Rui<Ray.Huang@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-05 11:14:15 -05:00
Dave Airlie	91c1ead6ae	Merge branch 'drm-next-5.3' of git://people.freedesktop.org/~agd5f/linux into drm-next New stuff for 5.3: - Add new thermal sensors for vega asics - Various RAS fixes - Add sysfs interface for memory interface utilization - Use HMM rather than mmu notifier for user pages - Expose xgmi topology via kfd - SR-IOV fixes - Fixes for manual driver reload - Add unique identifier for vega asics - Clean up user fence handling with UVD/VCE/VCN blocks - Convert DC to use core bpc attribute rather than a custom one - Add GWS support for KFD - Vega powerplay improvements - Add CRC support for DCE 12 - SR-IOV support for new security policy - Various cleanups From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190529220944.14464-1-alexander.deucher@amd.com	2019-05-31 10:04:39 +10:00
Emily Deng	394e9a14c6	drm/amdgpu: Need to set the baco cap before baco reset For passthrough, after rebooted the VM, driver will do a baco reset before doing other driver initialization during loading driver. For doing the baco reset, it will first check the baco reset capability. So first need to set the cap from the vbios information or baco reset won't be enabled. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-28 14:47:42 -05:00
Alex Deucher	dbaa922b57	drm/amdgpu: use pcie_bandwidth_available rather than open coding it It does the same thing we were doing already. I though it needed work for gen3/4 speeds, but that seems to be covered already. Reviewed-by: Evan Quan <evan.quan@amd.com> Acked-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-24 12:21:01 -05:00
Ori Messinger	5bb2353273	drm/amdgpu: Report firmware versions with sysfs v2 Firmware versions can be found as separate sysfs files at: /sys/class/drm/cardX/device/fw_version (where X is the card number) The firmware versions are displayed in hexadecimal. v2: Moved sysfs files to subfolder Signed-off-by: Ori Messinger <ori.messinger@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-24 12:20:52 -05:00
xinhui pan	5e6932fe31	drm/amdgpu: enable ras suspend/resume suspend/resume will change ras state behind us. Let driver get notified. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Tested-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-24 12:20:51 -05:00
xinhui pan	511fdbc33a	drm/amdgpu: ras support suspend/resume add ras suspend function. rename ras_post_init to amdgpu_ras_resume. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Tested-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-24 12:20:51 -05:00
Trigger Huang	2d11fd3f54	drm/amdgpu: initialize PSP before IH under SR-IOV In order to support new PSP feature that PSP may provide interface to program IH CNTL register, initialize PSP before IH under Vega10 SR-IOV VF Signed-off-by: Trigger Huang <Trigger.Huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-24 12:20:50 -05:00
Trigger Huang	78d4811267	drm/amdgpu: init vega10 SR-IOV reg access mode Set different register access mode according to the features provided by firmware Signed-off-by: Trigger Huang <Trigger.Huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-24 12:20:50 -05:00
xinhui pan	e79a04d531	drm/amdgpu: gpu reset will run ras post init ras need initialize proper state after late init Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-24 12:20:50 -05:00
xinhui pan	7c04ca50b0	drm/amdgpu: gpu reset will run late_init ras need late init to initialize proper state. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-24 12:20:50 -05:00
Kent Russell	dcea6e65d4	drm/amdgpu: Add PCIe replay count sysfs file Add a sysfs file for reporting the number of PCIe replays (NAKs). This returns the sum of NAKs received and NAKs generated Signed-off-by: Kent Russell <kent.russell@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-24 12:20:48 -05:00
Evan Quan	912dfc846a	drm/amdgpu: enable separate timeout setting for every ring type V4 Every ring type can have its own timeout setting. - V2: update lockup_timeout parameter format and cosmetic fixes - V3: invalidate 0 and negative values - V4: update lockup_timeout parameter format Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-24 12:20:48 -05:00
Sean Paul	374ed54293	Merge drm/drm-next into drm-misc-next Backmerging 5.2-rc1 to -misc-next for robher Signed-off-by: Sean Paul <seanpaul@chromium.org>	2019-05-22 16:08:21 -04:00
Maarten Lankhorst	752c4f3c1d	Merge remote-tracking branch 'drm/drm-next' into drm-misc-next Requested for backmerging airlied's drm-legacy cleanup. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>	2019-05-09 10:19:03 +02:00
Linus Torvalds	a2d635decb	drm pull request for 5.2 -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJc04M6AAoJEAx081l5xIa+SJgP/0uIgIOM53vPpydgmr+2IEHF jbDqrd+mipgNriRVHjDsWdUHCUNtyhB7YEBCMrj3mY0rRFI7FlQQf4lOwYGoHiKP 4JZg4kwC37997lFXl1uabGj3DmJLtxKL2/D15zCH/uLe+2EDzWznP6NVdFT3WK0P YKZQCWT19PWSsLoBRPutWxkmop4AYvkqE0a6vXUlJlFYZK3Bbytx6/179uWKfiX5 ZkKEEtx1XiDAvcp5gBb6PISurycrBY0e/bkPBnK3ES5vawMbTU5IrmWOrQ4D8yOd z9qOVZawZ6+b2XBDgBWjQ9bM7I5R7Il1q/LglYEaFI9+wHUnlUdDSm6ft5/5BiCZ fqgkh5Bj2iEsajbSsacoljMOpxpYPqj63mqc+7fAGXF34V+B+9U1bpt8kCbMKowf 7Abb7IuiCR6vLDapjP6VqTMvdQ4O466OEAN83ULGFTdmMqYYH4AxaIwc+xcAk/aP RNq7/RHhh4FRynRAj9fCkGlF3ArnM88gLINwWuEQq4SClWGcvdw7eaHpwWo77c4g iccCnTLqSIg5pDVu07AQzzBlW6KulWxh5o72x+Xx+EXWdYUDHQ1SlNs11bSNUBV1 5MkrzY2GuD+NFEjsXJEDIPOr40mQOyJCXnxq8nXPsz/hD9kHeJPvWn3J3eVKyb5B Z6/knNqM0BDn3SaYR/rD =YFiQ -----END PGP SIGNATURE----- Merge tag 'drm-next-2019-05-09' of git://anongit.freedesktop.org/drm/drm Pull drm updates from Dave Airlie: "This has two exciting community drivers for ARM Mali accelerators. Since ARM has never been open source friendly on the GPU side of the house, the community has had to create open source drivers for the Mali GPUs. Lima covers the older t4xx and panfrost the newer 6xx/7xx series. Well done to all involved and hopefully this will help ARM head in the right direction. There is also now the ability if you don't have any of the legacy drivers enabled (pre-KMS) to remove all the pre-KMS support code from the core drm, this saves 10% or so in codesize on my machine. i915 also enable Icelake/Elkhart Lake Gen11 GPUs by default, vboxvideo moves out of staging. There are also some rcar-du patches which crossover with media tree but all should be acked by Mauro. Summary: uapi changes: - Colorspace connector property - fourcc - new YUV formts - timeline sync objects initially merged - expose FB_DAMAGE_CLIPS to atomic userspace new drivers: - vboxvideo: moved out of staging - aspeed: ASPEED SoC BMC chip display support - lima: ARM Mali4xx GPU acceleration driver support - panfrost: ARM Mali6xx/7xx Midgard/Bitfrost acceleration driver support core: - component helper docs - unplugging fixes - devm device init - MIPI/DSI rate control - shmem backed gem objects - connector, display_info, edid_quirks cleanups - dma_buf fence chain support - 64-bit dma-fence seqno comparison fixes - move initial fb config code to core - gem fence array helpers for Lima - ability to remove legacy support code if no drivers requires it (removes 10% of drm.ko size) - lease fixes ttm: - unified DRM_FILE_PAGE_OFFSET handling - Account for kernel allocations in kernel zone only panel: - OSD070T1718-19TS panel support - panel-tpo-td028ttec1 backlight support - Ronbo RB070D30 MIPI/DSI - Feiyang FY07024DI26A30-D MIPI-DSI panel - Rocktech jh057n00900 MIPI-DSI panel i915: - Comet Lake (Gen9) PCI IDs - Updated Icelake PCI IDs - Elkhartlake (Gen11) support - DP MST property addtions - plane and watermark fixes - Icelake port sync and VEBOX disable fixes - struct_mutex usage reduction - Icelake gamma fix - GuC reset fixes - make mmap more asynchronous - sound display power well race fixes - DDI/MIPI-DSI clocks for Icelake - Icelake RPS frequency changing support - Icelake workarounds amdgpu: - Use HMM for userptr - vega20 experimental smu11 support - RAS support for vega20 - BACO support for vega12 + fixes for vega20 - reworked IH interrupt handling - amdkfd RAS support - Freesync improvements - initial timeline sync object support - DC Z ordering fixes - NV12 planes support - colorspace properties for planes= - eDP opts if eDP already initialized nouveau: - misc fixes etnaviv: - misc fixes msm: - GPU zap shader support expansion - robustness ABI addition exynos: - Logging cleanups tegra: - Shared reset fix - CPU cache maintenance fix cirrus: - driver rewritten using simple helpers meson: - G12A support vmwgfx: - Resource dirtying management improvements - Userspace logging improvements virtio: - PRIME fixes rockchip: - rk3066 hdmi support sun4i: - DSI burst mode support vc4: - load tracker to detect underflow v3d: - v3d v4.2 support malidp: - initial Mali D71 support in komeda driver tfp410: - omap related improvement omapdrm: - drm bridge/panel support - drop some omap specific panels rcar-du: - Display writeback support" * tag 'drm-next-2019-05-09' of git://anongit.freedesktop.org/drm/drm: (1507 commits) drm/msm/a6xx: No zap shader is not an error drm/cma-helper: Fix drm_gem_cma_free_object() drm: Fix timestamp docs for variable refresh properties. drm/komeda: Mark the local functions as static drm/komeda: Fixed warning: Function parameter or member not described drm/komeda: Expose bus_width to Komeda-CORE drm/komeda: Add sysfs attribute: core_id and config_id drm: add non-desktop quirk for Valve HMDs drm/panfrost: Show stored feature registers drm/panfrost: Don't scream about deferred probe drm/panfrost: Disable PM on probe failure drm/panfrost: Set DMA masks earlier drm/panfrost: Add sanity checks to submit IOCTL drm/etnaviv: initialize idle mask before querying the HW db drm: introduce a capability flag for syncobj timeline support drm: report consistent errors when checking syncobj capibility drm/nouveau/nouveau: forward error generated while resuming objects tree drm/nouveau/fb/ramgk104: fix spelling mistake "sucessfully" -> "successfully" drm/nouveau/i2c: Disable i2c bus access after ->fini() drm/nouveau: Remove duplicate ACPI_VIDEO_NOTIFY_PROBE definition ...	2019-05-08 21:35:19 -07:00
Dave Airlie	422449238e	Merge branch 'drm-next-5.2' of git://people.freedesktop.org/~agd5f/linux into drm-next - SR-IOV fixes - Raven flickering fix - Misc spelling fixes - Vega20 power fixes - Freesync improvements - DC fixes Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190502193020.3562-1-alexander.deucher@amd.com	2019-05-03 10:31:07 +10:00
Andrey Grodzovsky	1d721ed679	drm/amdgpu: Avoid HW reset if guilty job already signaled. Also reject TDRs if another one already running. v2: Stop all schedulers across device and entire XGMI hive before force signaling HW fences. Avoid passing job_signaled to helper fnctions to keep all the decision making about skipping HW reset in one place. v3: Fix SW sched. hang after non HW reset. sched.hw_rq_count has to be balanced against it's decrement in drm_sched_stop in non HW reset case. v4: rebase v5: Revert v3 as we do it now in sceduler code. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/1555599624-12285-6-git-send-email-andrey.grodzovsky@amd.com	2019-05-02 15:54:32 -05:00
Christian König	5918045c4e	drm/scheduler: rework job destruction We now destroy finished jobs from the worker thread to make sure that we never destroy a job currently in timeout processing. By this we avoid holding lock around ring mirror list in drm_sched_stop which should solve a deadlock reported by a user. v2: Remove unused variable. v4: Move guilty job free into sched code. v5: Move sched->hw_rq_count to drm_sched_start to account for counter decrement in drm_sched_stop even when we don't call resubmit jobs if guily job did signal. v6: remove unused variable Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109692 Acked-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/1555599624-12285-3-git-send-email-andrey.grodzovsky@amd.com	2019-05-02 15:45:48 -05:00
Andrey Grodzovsky	77e7f82985	drm/amdgpu: Change VRAM lost print from ERR to INF It's normal for VRAM to lost during GPU reset and so change the log level to INFO to avoid confusing users. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-04-23 12:09:26 -05:00
Dave Airlie	f06ddb5309	Linux 5.1-rc5 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAlyzsYgeHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGMw0H/ir42KJiABBKSETD 0d38qXVclAI/123zl8EkSfDrBKOsuIpXUDxzKeoDMhMkiurMpK6bbEOTPJAQMZJe nEYpq/bZQi+vO8Q/pMMpaC3ExlIRosd0JAR7TyDUh5ZAeeMuDNzmvMk/DPxXPbNt 0P1FWePDa7908ajCOW1T8ZrB9Ak8boo7TKkF3LBb00ks1mEkyp/l74MKOHdu+HYn XIwncX/Jotl4BrKdNC2f/NXYLYk6MrJDGug8TxuHgIqiMWhhrcSqbxU1ri7iqFXB cBYdFo6ZJ8CWHux8/5LY5CMjSqEtzKha2Ohuhy3MMu1RsICyFLQtHnxHJ1ytLSBt DOPcDQ0= =CEUD -----END PGP SIGNATURE----- BackMerge v5.1-rc5 into drm-next Need rc5 for udl fix to add udl cleanups on top. Signed-off-by: Dave Airlie <airlied@redhat.com>	2019-04-15 15:51:49 +10:00
wentalou	b575f10dbd	drm/amdgpu: shadow in shadow_list without tbo.mem.start cause page fault in sriov TDR shadow was added into shadow_list by amdgpu_bo_create_shadow. meanwhile, shadow->tbo.mem was not fully configured. tbo.mem would be fully configured by amdgpu_vm_sdma_map_table until calling amdgpu_vm_clear_bo. If sriov TDR occurred between amdgpu_bo_create_shadow and amdgpu_vm_sdma_map_table, amdgpu_device_recover_vram would deal with shadow without tbo.mem.start. Signed-off-by: Wentao Lou <Wentao.Lou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-04-12 11:23:49 -05:00
Yintian Tao	bb5a2bdf36	drm/amdgpu: support dpm level modification under virtualization v3 Under vega10 virtualuzation, smu ip block will not be added. Therefore, we need add pp clk query and force dpm level function at amdgpu_virt_ops to support the feature. v2: add get_pp_clk existence check and use kzalloc to allocate buf v3: return -ENOMEM for allocation failure and correct the coding style Signed-off-by: Yintian Tao <yttao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-04-10 13:53:27 -05:00
wentalou	1712fb1a2f	drm/amdgpu: amdgpu_device_recover_vram always failed if only one node in shadow_list amdgpu_bo_restore_shadow would assign zero to r if succeeded. r would remain zero if there is only one node in shadow_list. current code would always return failure when r <= 0. restart the timeout for each wait was a rather problematic bug as well. The value of tmo SHOULD be changed, otherwise we wait tmo jiffies on each loop. Signed-off-by: Wentao Lou <Wentao.Lou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-04-04 10:22:06 -05:00
shaoyunl	df399b0641	drm/amdgpu: XGMI pstate switch initial support Driver vote low to high pstate switch whenever there is an outstanding XGMI mapping request. Driver vote high to low pstate when all the outstanding XGMI mapping is terminated. Signed-off-by: shaoyunl <shaoyun.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-03-27 22:40:43 -05:00
Evan Quan	fed184e905	drm/amdgpu: trivial typo fix "error" was not correctly spelled. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-03-27 22:40:05 -05:00
Chengming Gui	ad51c46eec	drm/amd/amdgpu: fix PCIe dpm feature issue (v3) use pcie_bandwidth_available to get real link state to update pcie table. v2: fix incorrect initialized return value v3: expand the fetching method about the link width to all asics. Signed-off-by: Chengming Gui <Jack.Gui@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-03-27 22:31:56 -05:00
Christian König	86f7bae5cf	drm/amdgpu: revert "XGMI pstate switch initial support" This reverts commit `9b638f9751`. Adding this to the mapping is complete nonsense and the whole implementation looks racy. This patch wasn't thoughtfully reviewed and should be reverted for now. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Liu, Shaoyun <Shaoyun.Liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-03-21 14:05:01 -05:00
Huang Rui	005440066f	drm/amdgpu: enable gfxoff again on raven series (v2) This patch enables gfxoff and stutter mode again, since we take more testing on raven series. For raven2 and picasso, we can enable it directly. And for raven, we need check the RLC/SMC ucode version cannot be less than #531/0x1e45. v2: add smc version checking for raven. Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1) Tested-by: Likun Gao <Likun.Gao@amd.com> (v2) Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-03-20 23:39:49 -05:00
Wentao Lou	f81e8d532a	drm/amdkfd/sriov:Put the pre and post reset in exclusive mode v2 add amdgpu_amdkfd_pre_reset and amdgpu_amdkfd_post_reset inside amdgpu_device_reset_sriov. Signed-off-by: Wentao Lou <Wentao.Lou@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-03-20 23:39:47 -05:00
xinhui pan	108c6a6309	drm/amdgpu: add new ras workflow control flags add ras post init function. Do some initialization after all IP have finished their late init. Add new member flags which will control the ras work flow. For now, vbios enable ras for us on boot. That might change in the future. So there should be a flag from vbios to tell us if ras is enabled or not on boot. Looks like there is no such info now. Other bits of the flags are reserved to control other parts of ras. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-03-19 15:36:52 -05:00
xinhui pan	2be4c4a9d4	drm/amdgpu: reserve bad pages during recovery Mark vram pages with errors as bad and prevent the driver from using them. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-03-19 15:36:50 -05:00
xinhui pan	c030f2e416	drm/amdgpu: add amdgpu_ras.c to support ras (v2) add obj management. add feature control. add debugfs infrastructure. add sysfs infrastructure. add IH infrastructure. add recovery infrastructure. It is a framework. Other IPs need call amdgpu_ras_xxx function instead of psp_ras_xxx functions. v2: squash in warning fixes Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-03-19 15:36:50 -05:00
Andrey Grodzovsky	533aed278a	drm/amdgpu: Move IB pool init and fini v2 Problem: Using SDMA for TLB invalidation in certain ASICs exposed a problem of IB pool not being ready while SDMA already up on Init and already shutt down while SDMA still running on Fini. This caused IB allocation failure. Temproary fix was commited into a bringup branch but this is the generic fix. Fix: Init IB pool rigth after GMC is ready but before SDMA is ready. Do th opposite for Fini. v2: Remove restriction on SDMA early init and move amdgpu_ib_pool_fini Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-03-19 15:36:49 -05:00
shaoyunl	9b638f9751	drm/amdgpu: XGMI pstate switch initial support Driver vote low to high pstate switch whenever there is an outstanding XGMI mapping request. Driver vote high to low pstate when all the outstanding XGMI mapping is terminated. Signed-off-by: shaoyunl <shaoyun.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-03-19 15:36:48 -05:00
Likun Gao	3b94fb101f	drm/amd/powerplay: add limit of pp_feature for smu (v3) Move pp_feature from the struct of amd_powerplay to amdgpu_device. Add pp_feature limit for overdrive interface. v2: put pp_feature into struct amdgpu_pm. v3: merge feature_mask with pp_feature. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Suggested-by: Alex Deucher <alexander.deucher@amd.com> Suggested-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-03-19 15:04:02 -05:00
Dave Airlie	f4bc54b532	Merge branch 'drm-next-5.1' of git://people.freedesktop.org/~agd5f/linux into drm-next Updates for 5.1: - GDS fixes - Add AMDGPU_CHUNK_ID_SCHEDULED_DEPENDENCIES interface - GPUVM fixes - PCIE DPM switching fixes for vega20 - Vega10 uclk DPM regression fix - DC Freesync fixes - DC ABM fixes - Various DC cleanups Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190208210214.27666-1-alexander.deucher@amd.com	2019-02-11 14:04:20 +10:00
Harish Kasiviswanathan	c53134577c	drm/amdgpu: Fix pci platform speed and width The new Vega series GPU cards have in-built bridges. To get the pcie speed and width supported by the platform walk the hierarchy and get the slowest link. Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-02-07 14:03:18 -05:00
Dave Airlie	37fdaa3390	drm-misc-next for 5.1: UAPI Changes: Cross-subsystem Changes: Core Changes: - Split out some part of drm_crtc_helper.h into drm_probe_helper.h - DRIVER_* flags improvements - New tasks on the TODO-list - Improvements to the documentation Driver Changes: - Continual of drmP.h removal in multiple drivers - Removal of FBINFO_(FLAG_)DEFAULT in multiple drivers - sun4i: Addition of the A23 support, multiple fixes for the tiled formats - atmel-hlcdc: Fix of clipping and rotation properties - qxl: various BO-related improvements, prime and generic fbdev emulation support - dw-hdmi: Support for HDMI2.0 2160p modes and YUV420 output - New Sitronix ST7701 panel driver - New Kingdisplay KD097D04 panel driver - New LeMaker BL035-RGB-002 panel driver - New PDA 91-00156-A0 panel driver -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCXFRb/wAKCRDj7w1vZxhR xXj2AQDFAkH0/tksJ+T6yrHZXQE7q/6fAWTlrDXLo7Nb1hhfswEA1AyaHmZJA0Ar u7S+4Gfzs/gisw9Jr4bqQaerEpYJxQA= =qumD -----END PGP SIGNATURE----- Merge tag 'drm-misc-next-2019-02-01' of git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for 5.1: UAPI Changes: Cross-subsystem Changes: Core Changes: - Split out some part of drm_crtc_helper.h into drm_probe_helper.h - DRIVER_* flags improvements - New tasks on the TODO-list - Improvements to the documentation Driver Changes: - Continual of drmP.h removal in multiple drivers - Removal of FBINFO_(FLAG_)DEFAULT in multiple drivers - sun4i: Addition of the A23 support, multiple fixes for the tiled formats - atmel-hlcdc: Fix of clipping and rotation properties - qxl: various BO-related improvements, prime and generic fbdev emulation support - dw-hdmi: Support for HDMI2.0 2160p modes and YUV420 output - New Sitronix ST7701 panel driver - New Kingdisplay KD097D04 panel driver - New LeMaker BL035-RGB-002 panel driver - New PDA 91-00156-A0 panel driver Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime.ripard@bootlin.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190201144749.t3abxvguhstu6bcl@flea	2019-02-04 14:42:34 +10:00
Dave Airlie	e09191d360	Merge branch 'drm-next-5.1' of git://people.freedesktop.org/~agd5f/linux into drm-next New stuff for 5.1. amdgpu: - DC bandwidth formula updates - Support for DCC on scanout surfaces - Support for multiple IH rings on soc15 asics - Fix xgmi locking - Add sysfs interface to get pcie usage stats - Simplify DC i2c/aux code - Initial support for BACO on vega10/20 - New runtime SMU feature debug interface - Expand existing sysfs power interfaces to new clock domains - Handle kexec properly - Simplify IH programming - Rework doorbell handling across asics - Drop old CI DPM implementation - DC page flipping fixes - Misc SR-IOV fixes amdkfd: - Simplify the interfaces between amdkfd and amdgpu ttm: - Add a callback to notify the driver when the lru changes sched: - Refactor mirror list handling - Rework hw fence processing Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190125231517.26268-1-alexander.deucher@amd.com	2019-02-01 09:34:20 +10:00
Andrey Grodzovsky	222b5f0441	drm/sched: Refactor ring mirror list handling. Decauple sched threads stop and start and ring mirror list handling from the policy of what to do about the guilty jobs. When stoppping the sched thread and detaching sched fences from non signaled HW fenes wait for all signaled HW fences to complete before rerunning the jobs. v2: Fix resubmission of guilty job into HW after refactoring. v4: Full restart for all the jobs, not only from guilty ring. Extract karma increase into standalone function. v5: Rework waiting for signaled jobs without relying on the job struct itself as those might already be freed for non 'guilty' job's schedulers. Expose karma increase to drivers. v6: Use list_for_each_entry_safe_continue and drm_sched_process_job in case fence already signaled. Call drm_sched_increase_karma only once for amdgpu and add documentation. v7: Wait only for the latest job's fence. Suggested-by: Christian Koenig <Christian.Koenig@amd.com> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-01-25 16:15:36 -05:00
wentalou	f14899fd2a	drm/amdgpu: sriov should skip asic_reset in device_init sriov would meet guest driver load failure, if calling amdgpu_asic_reset in amdgpu_device_init. sriov should skip asic_reset in device_init. Signed-off-by: Wentao Lou <Wentao.Lou@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-01-25 16:15:35 -05:00
Daniel Vetter	fcd70cd36b	drm: Split out drm_probe_helper.h Having the probe helper stuff (which pretty much everyone needs) in the drm_crtc_helper.h file (which atomic drivers should never need) is confusing. Split them out. To make sure I actually achieved the goal here I went through all drivers. And indeed, all atomic drivers are now free of drm_crtc_helper.h includes. v2: Make it compile. There was so much compile fail on arm drivers that I figured I'll better not include any of the acks on v1. v3: Massive rebase because i915 has lost a lot of drmP.h includes, but not all: Through drm_crtc_helper.h > drm_modeset_helper.h -> drmP.h there was still one, which this patch largely removes. Which means rolling out lots more includes all over. This will also conflict with ongoing drmP.h cleanup by others I expect. v3: Rebase on top of atomic bochs. v4: Review from Laurent for bridge/rcar/omap/shmob/core bits: - (re)move some of the added includes, use the better include files in other places (all suggested from Laurent adopted unchanged). - sort alphabetically v5: Actually try to sort them, and while at it, sort all the ones I touch. v6: Rebase onto i915 changes. v7: Rebase once more. Acked-by: Harry Wentland <harry.wentland@amd.com> Acked-by: Sam Ravnborg <sam@ravnborg.org> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Benjamin Gaignard <benjamin.gaignard@linaro.org> Acked-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Neil Armstrong <narmstrong@baylibre.com> Acked-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com> Acked-by: CK Hu <ck.hu@mediatek.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Sam Ravnborg <sam@ravnborg.org> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: Liviu Dudau <liviu.dudau@arm.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: virtualization@lists.linux-foundation.org Cc: etnaviv@lists.freedesktop.org Cc: linux-samsung-soc@vger.kernel.org Cc: intel-gfx@lists.freedesktop.org Cc: linux-mediatek@lists.infradead.org Cc: linux-amlogic@lists.infradead.org Cc: linux-arm-msm@vger.kernel.org Cc: freedreno@lists.freedesktop.org Cc: nouveau@lists.freedesktop.org Cc: spice-devel@lists.freedesktop.org Cc: amd-gfx@lists.freedesktop.org Cc: linux-renesas-soc@vger.kernel.org Cc: linux-rockchip@lists.infradead.org Cc: linux-stm32@st-md-mailman.stormreply.com Cc: linux-tegra@vger.kernel.org Cc: xen-devel@lists.xen.org Link: https://patchwork.freedesktop.org/patch/msgid/20190117210334.13234-1-daniel.vetter@ffwll.ch	2019-01-24 13:20:42 +01:00
Alex Deucher	95e8e59ec4	drm/amdgpu: check if we need to reset at init time (v2) To deal with situations like kexec or GPU VM passthrough where the device may have been used previously without a proper GPU reset between. v2: rebase bug: https://bugs.freedesktop.org/show_bug.cgi?id=108585 bug: https://bugs.freedesktop.org/show_bug.cgi?id=108754 Reviewed-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-01-14 15:04:57 -05:00
Tom St Denis	22d6575b8d	drm/amd/amdgpu: add missing mutex lock to amdgpu_get_xgmi_hive() (v3) v2: Move locks around in other functions so that this function can stand on its own. Also only hold the hive specific lock for add/remove device instead of the driver global lock so you can't add/remove devices in parallel from one hive. v3: add reset_lock Acked-by: Shaoyun.liu < Shaoyun.liu@amd.com> Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-01-14 15:04:53 -05:00
Emily Deng	72d3f59205	drm/amdgpu/sriov: For finishing routine send rel event after init failed When init fail, send rel init, req_fini and rel_fini to host for the finishing routine. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-01-14 15:04:50 -05:00
wentalou	0aaeefccb4	drm/amdgpu: distinguish early and late re-init log in sriov distinguish ip_reinit_early_sriov and ip_reinit_late_sriov by different log RE-INIT-early and RE-INIT-late Signed-off-by: Wentao Lou <Wentao.Lou@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-01-14 15:04:49 -05:00
Emily Deng	d3c117e564	drm/amdgpu/sriov:Correct pfvf exchange logic The pfvf exchange need be in exclusive mode. And add pfvf exchange in gpu reset. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-By: Xiangliang Yu <Xiangliang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-01-14 15:04:24 -05:00
Emily Deng	91334223b2	drm/amdgpu/virtual_dce: No need to pin the cursor bo For virtual display feature, no need to pin cursor bo. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-01-14 15:04:23 -05:00
Daniel Vetter	c2d88e06bc	drm: Move the legacy kms disable_all helper to crtc helpers It's not a core function, and the matching atomic functions are also not in the core. Plus the suspend/resume helper is also already there. Needs a tiny bit of open-coding, but less midlayer beats that I think. v2: Rebase onto ast (which gained a new user). Cc: Sam Bobroff <sbobroff@linux.ibm.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Sean Paul <sean@poorly.run> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <maxime.ripard@bootlin.com> Cc: Sean Paul <sean@poorly.run> Cc: David Airlie <airlied@linux.ie> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com> Cc: Rex Zhu <Rex.Zhu@amd.com> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Cc: Huang Rui <ray.huang@amd.com> Cc: Shaoyun Liu <Shaoyun.Liu@amd.com> Cc: Monk Liu <Monk.Liu@amd.com> Cc: nouveau@lists.freedesktop.org Cc: amd-gfx@lists.freedesktop.org Link: https://patchwork.freedesktop.org/patch/msgid/20181217194303.14397-4-daniel.vetter@ffwll.ch	2019-01-11 22:54:29 +01:00
wentalou	7b184b0061	drm/amdgpu: kfd_pre_reset outside req_full_gpu cause sriov hang XGMI hive put kfd_pre_reset into amdgpu_device_lock_adev, but outside req_full_gpu of sriov. It would make sriov hang during reset. Signed-off-by: Wentao Lou <Wentao.Lou@amd.com> Reviewed-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-12-14 12:04:38 -05:00
Andrey Grodzovsky	fc42d47ce0	drm/amdgpu: Enable GPU recovery by default for CI I retested Bonaire (gfx7 dGPU) and it works fine. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-12-12 14:26:40 -05:00
Alex Deucher	223577753b	drm/amdgpu/si: fix SI after doorbell rework SI does not use doorbells, move asic doorbell init later asic check. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=108920 Reviewed-by: Oak Zeng <Oak.Zeng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-12-05 17:49:50 -05:00
Andrey Grodzovsky	d4535e2c01	drm/amdgpu: Implement concurrent asic reset for XGMI. Use per hive wq to concurrently send reset commands to all nodes in the hive. v2: Switch to system_highpri_wq after dropping dedicated queue. Fix non XGMI code path KASAN error. Stop the hive reset for each node loop if there is a reset failure on any of the nodes. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-12-03 11:15:14 -05:00
Andrey Grodzovsky	a82400b57a	drm/amdgpu: Handle xgmi device removal. XGMI hive has some resources allocted on device init which needs to be deallocated when the device is unregistered. v2: Remove creation of dedicated wq for XGMI hive reset. v3: Use the gmc.xgmi.supported flag Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-12-03 11:15:08 -05:00
Oak Zeng	88dc26e46b	drm/amdgpu: Fix num_doorbell calculation issue When paging queue is enabled, it use the second page of doorbell. The AMDGPU_DOORBELL64_MAX_ASSIGNMENT definition assumes all the kernel doorbells are in the first page. So with paging queue enabled, the total kernel doorbell range should be original num_doorbell plus one page (0x400 in dword), not *2. Signed-off-by: Oak Zeng <ozeng@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-11-30 12:01:04 -05:00
Andrey Grodzovsky	26bc534094	drm/amdgpu: Refactor GPU reset for XGMI hive case For XGMI hive case do reset in steps where each step iterates over all devs in hive. This especially important for asic reset since all PSP FW in hive must come up within a limited time (around 1 sec) to properply negotiate the link. Do this by refactoring amdgpu_device_gpu_recover and amdgpu_device_reset into pre_asic_reset, asic_reset and post_asic_reset functions where is part is exectued for all the GPUs in the hive before going to the next step. v2: Update names for amdgpu_device_lock/unlock functions. v3: Introduce per hive locking to avoid multiple resets for GPUs in same hive. v4: Remove delayed_workqueue()/ttm_bo_unlock_delayed_workqueue() - they are copy & pasted over from radeon and on amdgpu there isn't any reason for that any more. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-11-28 15:55:36 -05:00
Andrey Grodzovsky	5183411b56	drm/amdgpu: Refactor amdgpu_xgmi_add_device This is prep work for updating each PSP FW in hive after GPU reset. Split into build topology SW state and update each PSP FW in the hive. Save topology and count of XGMI devices for reuse. v2: Create seperate header for XGMI. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-11-28 15:55:35 -05:00
Oak Zeng	9564f1928e	drm/amdgpu: Use asic specific doorbell index instead of macro definition ASIC specific doorbell layout is used instead of enum definition Signed-off-by: Oak Zeng <ozeng@amd.com> Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com> Suggested-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-11-28 15:55:33 -05:00
Oak Zeng	6585661ddd	drm/amdgpu: Call doorbell index init on device initialization Also call functioin amdgpu_device_doorbell_init after amdgpu_device_ip_early_init because the former depends on the later to set up asic-specific init_doorbell_index function Signed-off-by: Oak Zeng <ozeng@amd.com> Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com> Suggested-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-11-28 15:55:32 -05:00
Philip Yang	ec3db8a63d	drm/amdgpu: enable paging queue doorbell support v4 Because increase SDMA_DOORBELL_RANGE to add new SDMA doorbell for paging queue will break SRIOV, instead we can reserve and map two doorbell pages for amdgpu, paging queues doorbell index use same index as SDMA gfx queues index but on second page. For Vega20, after we change doorbell layout to increase SDMA doorbell for 8 SDMA RLC queues later, we could use new doorbell index for paging queue. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-11-20 14:04:58 -05:00
Hawking Zhang	3e2e2ab554	drm/amdgpu/psp: initialize xgmi session (v2) Setup and tear down xgmi as part of psp. v2: - make psp_xgmi_terminate static - squash in: drm/amdgpu: only issue xgmi cmd when it is enabled drm/amdgpu/psp: terminate xgmi ta in suspend and hw_fini phase Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-11-06 14:02:43 -05:00
Rex Zhu	1e256e2762	drm/amdgpu: Refine CSA related functions There is no functional changes, Use function arguments for SRIOV special variables which is hardcode in those functions. so we can share those functions in baremetal. Reviewed-by: Monk Liu <Monk.Liu@amd.com> Signed-off-by: Rex Zhu <Rex.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-11-05 14:21:48 -05:00
Andrey Grodzovsky	3ba7b418f1	drm/amdgpu: Enable default GPU reset for dGPU on gfx8/9 v3 After testing looks like these subset of ASICs has GPU reset working for the most part. Enable reset due to job timeout. v2: Switch from GFX version to ASIC type. v3: Fix identation Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-11-05 14:21:23 -05:00
Christian König	9d064be1e6	drm/amdgpu: revert "enable gfxoff in non-sriov and stutter mode by default" This is still completely breaking my Raven system. This reverts commit cdf2f910fa969adca1b0e3ad2b487821233dc038. Revert until we sort out the sbios and firmware combinations that work correctly. bug: https://bugs.freedesktop.org/show_bug.cgi?id=108606 Cc: stable@vger.kernel.org # v4.19 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-11-01 09:56:56 -05:00
Andrey Grodzovsky	734afd4b21	drm/amdgpu: Fix skipping hangged job reset during gpu recover. Problem: During GPU recover DAL would hang in amdgpu_pm_compute_clocks->amdgpu_fence_wait_empty Fix: Turns out there was a typo introduced by `3320b8d` drm/amdgpu: remove job->ring which caused skipping amdgpu_fence_driver_force_completion and so the hangged job was never force signaled and this would cause the hang later in DAL. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-11-01 09:51:33 -05:00
Emily Deng	91eec27ebb	drm/amdgpu: Fix null pointer amdgpu_device_fw_loading Need to check adev->powerplay.pp_funcs. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-10-22 14:40:54 -05:00
Rex Zhu	7a3e0bb2a5	drm/amdgpu: Load fw between hw_init/resume_phase1 and phase2 Extract the function of fw loading out of powerplay. Do fw loading between hw_init/resuem_phase1 and phase2 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Rex Zhu <Rex.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-10-10 14:49:21 -05:00
Rex Zhu	0a4f25205e	drm/amdgpu: split ip hw_init into 2 phases We need to do some IPs earlier to deal with ordering issues similar to how resume is split into two phases. Will do fw loading via smu/psp between the two phases. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Rex Zhu <Rex.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-10-10 14:49:09 -05:00
Rex Zhu	c8963ea4ce	drm/amdgpu: Split amdgpu_ucode_init/fini_bo into two functions 1. one is for create/free bo when init/fini 2. one is for fill the bo before fw loading the ucode bo only need to be created when load driver and free when driver unload. when resume/reset, driver only need to re-fill the bo if the bo is allocated in vram. Suggested by Christian. v2: Return error when bo create failed. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Rex Zhu <Rex.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-10-10 14:48:52 -05:00
Rex Zhu	a2d31dc3cf	drm/amdgpu: Check late_init status before set cg/pg state Fix cg/pg unexpected set in hw init failed case. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Rex Zhu <Rex.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-10-10 14:48:44 -05:00
Rex Zhu	73f847dbab	drm/amdgpu: Refine function amdgpu_device_ip_late_init 1. only call late_init when hw_init successful, so check status.hw instand of status.valid in late_init. 2. set status.late_initialized true if late_init was not implemented. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Rex Zhu <Rex.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-10-10 14:48:28 -05:00
Rex Zhu	44779b43f1	drm/amdgpu: Move gfx flag in_suspend to adev Move in_suspend flag to adev from gfx, so can be used in other ip blocks, also keep consistent with gpu_in_reset flag. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Rex Zhu <Rex.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-10-09 17:05:33 -05:00
Evan Quan	b55c9e7a11	drm/amd/powerplay: helper interfaces for MGPU fan boost feature MGPU fan boost feature is enabled only when two or more dGPUs in the system. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-10-09 16:59:56 -05:00
Christian König	403009bfba	drm/amdgpu: fix shadow BO restoring Don't grab the reservation lock any more and simplify the handling quite a bit. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-09-19 12:38:41 -05:00

... 2 3 4 5 6 ...

821 Commits