Commit Graph

81226 Commits

Author SHA1 Message Date
Maarten Lankhorst
6945c53bc7 drm/i915: Add locking to i915_gem_evict_vm(), v3.
i915_gem_evict_vm will need to be able to evict objects that are
locked by the current ctx. By testing if the current context already
locked the object, we can do this correctly. This allows us to
evict the entire vm even if we already hold some objects' locks.

Previously, this was spread over several commits, but it makes
more sense to commit the changes to i915_gem_evict_vm separately
from the changes to i915_gem_evict_something() and
i915_gem_evict_for_node().

Changes since v1:
- Handle evicting dead objects better.
Changes since v2:
- Use for_i915_gem_ww in igt_evict_vm. (Thomas)

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
[mlankhorst: Fix up doc warning.]
Link: https://patchwork.freedesktop.org/patch/msgid/20220117075604.131477-1-maarten.lankhorst@linux.intel.com
2022-01-18 12:00:30 +01:00
Maarten Lankhorst
e849f7e708 drm/i915: Call i915_gem_evict_vm in vm_fault_gtt to prevent new ENOSPC errors, v2.
Now that we cannot unbind kill the currently locked object directly
because we're removing short term pinning, we may have to unbind the
object from gtt manually, using a i915_gem_evict_vm() call.

Changes since v1:
- Remove -ENOSPC warning, can still happen with concurrent mmaps
  where we can't unbind the other mmap because of the lock held.
  This  fixes the gem_mmap_gtt@cpuset tests.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220114132320.109030-2-maarten.lankhorst@linux.intel.com
2022-01-18 12:00:29 +01:00
Tvrtko Ursulin
647bfd26bf Merge drm/drm-next into drm-intel-gt-next
Maarten needs backmerge to account for header file renames/changes which
landed via drm-intel-next and are interfering with his pinning work.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2022-01-18 10:54:02 +00:00
Colin Ian King
3a5d604f90 i915: make array flex_regs static const
Don't populate the read-only array flex_regs on the stack but
instead it static const. Also makes the object code a little smaller.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220112223435.949071-1-colin.i.king@gmail.com
2022-01-14 15:25:18 +00:00
Daniel Vetter
4efdddbce7 Merge tag 'amd-drm-next-5.17-2022-01-12' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.17-2022-01-12:

amdgpu:
- SR-IOV fixes
- Suspend/resume fixes
- Display fixes
- DMCUB fixes
- DP alt mode fixes
- RAS fixes
- UBSAN fix
- Navy Flounder VCN fix
- ttm resource manager cleanup
- default_groups change for kobj_type
- vkms fix
- Aldebaran fixes

amdkfd:
- SDMA ECC interrupt fix
- License clarification
- Pointer check fix
- DQM fixes for hawaii
- default_groups change for kobj_type
- Typo fixes

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220113030537.5758-1-alexander.deucher@amd.com
2022-01-14 15:42:28 +01:00
Daniel Vetter
820e690e4e Merge tag 'drm-misc-next-fixes-2022-01-14' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
* atomic helpers: Fix error messages
 * mipi-dbi: Fix buffer mapping

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/YeGHu7qU92pjuQOn@linux-uq9g
2022-01-14 15:41:48 +01:00
Daniel Vetter
71e4a70290 Merge tag 'drm-misc-fixes-2022-01-14' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
Two DT bindings fixes for meson, a device refcounting fix for sun4i, a
probe fix for vga16fb, a locking fix for the CMA dma-buf heap and a
compilation fix for ttm.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
[danvet: I made sure I have exactly the same conflict resolution as
Linus in 8d0749b4f8 ("Merge tag 'drm-next-2022-01-07' of
git://anongit.freedesktop.org/drm/drm") to avoid further conflict fun.
From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20220114125454.zs46ny52lrxk3ljz@houat
2022-01-14 15:17:17 +01:00
Thomas Zimmermann
5d474cc501 drm/mipi-dbi: Fix source-buffer address in mipi_dbi_buf_copy
Set the source-buffer address after mapping the buffer into the
kernel's address space. Makes MIPI DBI helpers work again.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Fixes: c47160d8ed ("drm/mipi-dbi: Remove dependency on GEM CMA helper library")
Reviewed-by: Noralf Trønnes <noralf@tronnes.org>
Reported-by: Noralf Trønnes <noralf@tronnes.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20220111132634.18302-1-tzimmermann@suse.de
(cherry picked from commit 9d31993451)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2022-01-14 14:43:02 +01:00
Claudio Suarez
bd672b7559 drm: fix error found in some cases after the patch d1af5cd86997
The patch d1af5cd86997 ("drm: get rid of DRM_DEBUG_* log
calls in drm core, files drm_a*.c") fails when the drm_device
cannot be found in the parameter plane_state->crtc.
Fix it using plane_state->plane.

Reported-by: kernel test robot <oliver.sang@intel.com>
Fixes: 6e22dc3583 ("drm: get rid of DRM_DEBUG_* log calls in drm core, files drm_a*.c")
Signed-off-by: Claudio Suarez <cssk@net-c.es>
[danvet: fix Fixes: line]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/YaiXEARd8z2C463h@gineta.localdomain
(cherry picked from commit 521d459b19)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2022-01-14 14:31:54 +01:00
Daniel Vetter
ad783ff5a2 Merge tag 'drm-misc-next-fixes-2022-01-13' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
* Fix use of CRTC state's active vs enable in atomic helper

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/Yd/i/zj9vEHisSSB@linux-uq9g
2022-01-14 13:39:56 +01:00
Daniel Vetter
8f18a987ca Merge tag 'drm-intel-next-fixes-2022-01-13' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
- Hold runtime PM wakelock during PXP unbind (Juston Li)
- Three fixes for the TTM backend fault handling (Matthew Auld)
- Make sure to unmap when purging in the TTM backend (Matthew Auld)

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Yd/xzyCM87rfrwQT@tursulin-mobl2
2022-01-14 13:34:40 +01:00
Johannes Berg
016017a195 drm/ttm: fix compilation on ARCH=um
Even if it's probably not really useful, it can get selected
by e.g. randconfig builds, and then failing to compile is an
annoyance. Unfortunately, it's hard to fix in Kconfig, since
DRM_TTM is selected by many things that don't really depend
on any specific architecture, and just depend on PCI (which
is indeed now available in ARCH=um via simulation/emulation).

Fix this in the code instead by just ifdef'ing the relevant
two lines that depend on "real X86".

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20211220111519.a4c8c6eff702.Ie4cf4e68698f6a9f546b83379bc52c266504424f@changeid
2022-01-14 13:30:56 +01:00
Kent Russell
5eb877b282 drm/amdkfd: Fix ASIC name typos
Three misspelled ASICs in comments here, so fix the spelling

Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:28 -05:00
Felix Kuehling
6f4cb84ae0 drm/amdkfd: Fix DQM asserts on Hawaii
start_nocpsch would never set dqm->sched_running on Hawaii due to an
early return statement. This would trigger asserts in other functions
and end up in inconsistent states.

Bug: https://github.com/RadeonOpenCompute/ROCm/issues/1624
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:28 -05:00
Harry Wentland
dc5d4aff2e drm/amdgpu: Use correct VIEWPORT_DIMENSION for DCN2
For some reason this file isn't using the appropriate register
headers for DCN headers, which means that on DCN2 we're getting
the VIEWPORT_DIMENSION offset wrong.

This means that we're not correctly carving out the framebuffer
memory correctly for a framebuffer allocated by EFI and
therefore see corruption when loading amdgpu before the display
driver takes over control of the framebuffer scanout.

Fix this by checking the DCE_HWIP and picking the correct offset
accordingly.

Long-term we should expose this info from DC as GMC shouldn't
need to know about DCN registers.

Cc: stable@vger.kernel.org
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:28 -05:00
Tao Zhou
15084a8e16 drm/amd/pm: only send GmiPwrDnControl msg on master die (v3)
PMFW only returns 0 on master die and sends NACK back on other dies for
the message.

v2: only send GmiPwrDnControl msg on master die instead of all
dies.
v3: remove the pointer check for get_socket_id and get_die_id as they
should be present on Aldebaran.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:28 -05:00
Guchun Chen
2096b74b1d drm/amdgpu: use spin_lock_irqsave to avoid deadlock by local interrupt
This is observed in SRIOV case with virtual KMS as display.

_raw_spin_lock_irqsave+0x37/0x40
drm_handle_vblank+0x69/0x350 [drm]
? try_to_wake_up+0x432/0x5c0
? amdgpu_vkms_prepare_fb+0x1c0/0x1c0 [amdgpu]
drm_crtc_handle_vblank+0x17/0x20 [drm]
amdgpu_vkms_vblank_simulate+0x4d/0x80 [amdgpu]
__hrtimer_run_queues+0xfb/0x230
hrtimer_interrupt+0x109/0x220
__sysvec_apic_timer_interrupt+0x64/0xe0
asm_call_irq_on_stack+0x12/0x20

Fixes: 84ec374bd5 ("drm/amdgpu: create amdgpu_vkms (v4)")
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Kelly Zytaruk <kelly.zytaruk@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:28 -05:00
Prike Liang
4eaf21b752 drm/amdgpu: not return error on the init_apu_flags
In some APU project we needn't always assign flags to identify each other,
so we may not need return an error.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:27 -05:00
Felix Kuehling
b121862c78 drm/amdkfd: Use prange->update_list head for remove_list
The remove_list head was only used for keeping track of existing ranges
that are to be removed from the svms->list. The update_list was used for
new or existing ranges that need updated attributes. These two cases are
mutually exclusive (i.e. the same range will never be on both lists).
Therefore we can use the update_list head to track the remove_list and
save another 16 bytes in the svm_range struct.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:27 -05:00
Felix Kuehling
ef3b4137aa drm/amdkfd: Use prange->list head for insert_list
There are seven list_heads in struct svm_range: list, update_list,
remove_list, insert_list, svm_bo_list, deferred_list, child_list. This
patch and the next one remove two of them that are redundant.

The insert_list head was only used for new ranges that are not on the
svms->list yet. So we can use that list head for keeping track of
new ranges before they get added, and use list_move_tail to move them
to the svms->list when ready.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:27 -05:00
Jiasheng Jiang
abfaf0eee9 drm/amdkfd: Check for null pointer after calling kmemdup
As the possible failure of the allocation, kmemdup() may return NULL
pointer.
Therefore, it should be better to check the 'props2' in order to prevent
the dereference of NULL pointer.

Fixes: 3a87177eb1 ("drm/amdkfd: Add topology support for dGPUs")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:27 -05:00
José Expósito
978ffac878 drm/amd/display: invalid parameter check in dmub_hpd_callback
The function performs a check on the "adev" input parameter, however, it
is used before the check.

Initialize the "dev" variable after the sanity check to avoid a possible
NULL pointer dereference.

Fixes: e27c41d5b0 ("drm/amd/display: Support for DMUB HPD interrupt handling")
Addresses-Coverity-ID: 1493909 ("Null pointer dereference")
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: José Expósito <jose.exposito89@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:27 -05:00
Rajneesh Bhardwaj
8b5da5a458 Revert "drm/amdgpu: Don't inherit GEM object VMAs in child process"
This reverts commit fbcdbfde87.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:27 -05:00
Mario Limonciello
83293f7f3d drm/amd/display: reset dcn31 SMU mailbox on failures
Otherwise future commands may fail as well leading to downstream
problems that look like they stemmed from a timeout the first time
but really didn't.

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:27 -05:00
Greg Kroah-Hartman
5fea167ec0 drm/amdkfd: use default_groups in kobj_type
There are currently 2 ways to create a set of sysfs files for a
kobj_type, through the default_attrs field, and the default_groups
field.  Move the amdkfd sysfs code to use default_groups field which has
been the preferred way since aa30f47cf6 ("kobject: Add support for
default attribute groups to kobj_type") so that we can soon get rid of
the obsolete default_attrs field.

Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:26 -05:00
Greg Kroah-Hartman
7ff61cdcc8 drm/amdgpu: use default_groups in kobj_type
There are currently 2 ways to create a set of sysfs files for a
kobj_type, through the default_attrs field, and the default_groups
field.  Move the amdgpu sysfs code to use default_groups field which has
been the preferred way since aa30f47cf6 ("kobject: Add support for
default attribute groups to kobj_type") so that we can soon get rid of
the obsolete default_attrs field.

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: John Clements <john.clements@amd.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Jonathan Kim <jonathan.kim@amd.com>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: shaoyunl <shaoyun.liu@amd.com>
Cc: Tao Zhou <tao.zhou1@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:26 -05:00
Tom St Denis
4cc9f86f85 drm/amd/amdgpu: Add pcie indirect support to amdgpu_mm_wreg_mmio_rlc()
The function amdgpu_mm_wreg_mmio_rlc() is used by debugfs to write to
MMIO registers.  It didn't support registers beyond the BAR mapped MMIO
space.  This adds pcie indirect write support.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:26 -05:00
Nirmoy Das
575e55ee4f drm/amdgpu: recover gart table at resume
Get rid off pin/unpin of gart BO at resume/suspend and
instead pin only once and try to recover gart content
at resume time. This is much more stable in case there
is OOM situation at 2nd call to amdgpu_device_evict_resources()
while evicting GART table.

v3: remove gart recovery from other places
v2: pin gart at amdgpu_gart_table_vram_alloc()

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:26 -05:00
Nirmoy Das
ec6aae9711 drm/amdgpu: do not pass ttm_resource_manager to vram_mgr
Do not allow exported amdgpu_vram_mgr_*() to accept
any ttm_resource_manager pointer. Also there is no need
to force other module to call a ttm function just to
eventually call vram_mgr functions.

v2: pass adev's vram_mgr instead of adev

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:26 -05:00
Nirmoy Das
ffb378fb30 drm/amdkfd: remove unused function
Remove unused amdgpu_amdkfd_get_vram_usage()

CC: Felix.Kuehling@amd.com

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fixes: dfcbe6d5f4 ("drm/amdgpu: Remove unused function pointers")
2022-01-11 15:44:26 -05:00
Nirmoy Das
1dd8b1b987 drm/amdgpu: do not pass ttm_resource_manager to gtt_mgr
Do not allow exported amdgpu_gtt_mgr_*() to accept
any ttm_resource_manager pointer. Also there is no need
to force other module to call a ttm function just to
eventually call gtt_mgr functions.

v4: remove unused adev.
v3: upcast mgr from ttm resopurce manager instead of
getting it from adev.
v2: pass adev's gtt_mgr instead of adev.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:26 -05:00
Leslie Shi
62d5f9f711 drm/amdgpu: Unmap MMIO mappings when device is not unplugged
Patch: 3efb17ae7e92 ("drm/amdgpu: Call amdgpu_device_unmap_mmio() if device
is unplugged to prevent crash in GPU initialization failure") makes call to
amdgpu_device_unmap_mmio() conditioned on device unplugged. This patch unmaps
MMIO mappings even when device is not unplugged.

v2: Add condition of drm_dev_enter() to deleted unmaps in patch
"drm/amdgpu: Unmap all MMIO mappings"

Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:26 -05:00
Peng Ju Zhou
6638391b9f drm/amdgpu: Enable second VCN for certain Navy Flounder.
Certain Navy Flounder cards have 2 VCNs, enable it.

Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:25 -05:00
Mario Limonciello
63ad5371cd drm/amd/display: explicitly set is_dsc_supported to false before use
When UBSAN is enabled a case is shown on unplugging the display that
this variable hasn't been initialized by `update_dsc_caps`, presumably
when the display was unplugged it wasn't copied from the DPCD.

Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1956497
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:25 -05:00
Jiawei Gu
b54ce6c92c drm/amdgpu: Clear garbage data in err_data before usage
Memory of err_data should be cleaned before usage
when there're multiple entry in ras ih.
Otherwise garbage data from last loop will be used.

Signed-off-by: Jiawei Gu <Jiawei.Gu@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-01-11 15:44:25 -05:00
John Harrison
afd088ac05 drm/i915/guc: Improve GuC loading status check/error reports
If the GuC fails to load, it is useful to know what firmware file /
version was attempted. So move the version info report to before the
load attempt rather than only after a successful load.

If the GuC does fail to load, then make the error messages visible
rather than being 'debug' prints that do not appears in dmesg output
by default.

When waiting for the GuC to load, it used to be necessary to check for
two different states - READY and (LAPIC_DONE | MIA_CORE). Apparently
the second signified init complete on RC6 exit. However, in more
recent GuC versions the RC6 exit sequence now finishes with status
READY as well. So the test can be simplified.

Also, add an enum giving all the current status codes that GuC loading
can report as a reference without having to pull and search through
the GuC source files.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220107000622.292081-4-John.C.Harrison@Intel.com
2022-01-11 11:47:43 -08:00
John Harrison
77b6f79df6 drm/i915/guc: Update to GuC version 69.0.3
Update to the latest GuC release.

The latest GuC firmware introduces a number of interface changes:

GuC may return NO_RESPONSE_RETRY message for requests sent over CTB.
Add support for this reply and try resending the request again as a
new CTB message.

A KLV (key-length-value) mechanism is now used for passing
configuration data such as CTB management.

With the new KLV scheme, the old CTB management actions are no longer
used and are removed.

Register capture on hang is now supported by GuC. Full i915 support
for this will be added by a later patch. A minimum support of
providing capture memory and register lists is required though, so add
that in.

The device id of the current platform needs to be provided at init time.

The 'poll CS' w/a (Wa_22012773006) was blanket enabled by previous
versions of GuC. It must now be explicitly requested by the KMD. So,
add in the code to turn it on when relevant.

The GuC log entry format has changed. This requires adding a new field
to the log header structure to mark the wrap point at the end of the
buffer (as the buffer size is no longer a multiple of the log entry
size).

New CTB notification messages are now sent for some things that were
previously only sent via MMIO notifications.

Of these, the crash dump notification was not really being handled by
i915. It called the log flush code but that only flushed the regular
debug log and then only if relay logging was enabled. So just report
an error message instead.

The 'exception' notification was just being ignored completely. So add
an error message for that as well.

Note that in either the crash dump or the exception case, the GuC is
basically dead. The KMD will detect this via the heartbeat and trigger
both an error log (which will include the crash dump as part of the
GuC log) and a GT reset. So no other processing is really required.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220107000622.292081-3-John.C.Harrison@Intel.com
2022-01-11 11:47:36 -08:00
John Harrison
53c8283ba9 drm/i915/guc: Temporarily bump the GuC load timeout
There is a known (but exceedingly unlikely) race condition where the
asynchronous frequency management code could reduce the GT clock while
a GuC reload is in progress (during a full GT reset). A fix is in
progress but there are complex locking issues to be resolved. In the
meantime bump the timeout to 200ms. Even at slowest clock, this
should be sufficient. And in the working case, a larger timeout makes
no difference.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220107000622.292081-2-John.C.Harrison@Intel.com
2022-01-11 11:47:35 -08:00
Liu Ying
69e630016e drm/atomic: Check new_crtc_state->active to determine if CRTC needs disable in self refresh mode
Actual hardware state of CRTC is controlled by the member 'active' in
struct drm_crtc_state instead of the member 'enable', according to the
kernel doc of the member 'enable'.  In fact, the drm client modeset
and atomic helpers are using the member 'active' to do the control.

Referencing the member 'enable' of new_crtc_state, the function
crtc_needs_disable() may fail to reflect if CRTC needs disable in
self refresh mode, e.g., when the framebuffer emulation will be blanked
through the client modeset helper with the next commit, the member
'enable' of new_crtc_state is still true while the member 'active' is
false, hence the relevant potential encoder and bridges won't be disabled.

So, let's check new_crtc_state->active to determine if CRTC needs disable
in self refresh mode instead of new_crtc_state->enable.

Fixes: 1452c25b0e ("drm: Add helpers to kick off self refresh mode in drivers")
Cc: Sean Paul <seanpaul@chromium.org>
Cc: Rob Clark <robdclark@chromium.org>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Liu Ying <victor.liu@nxp.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211230040626.646807-1-victor.liu@nxp.com
2022-01-11 10:37:15 -05:00
Thomas Hellström
60dc43d119 drm/i915: Use struct vma_resource instead of struct vma_snapshot
There is always a struct vma_resource guaranteed to be alive when we
access a corresponding struct vma_snapshot.

So ditch the latter and instead of allocating vma_snapshots, reference
the already existning vma_resource.

This requires a couple of extra members in struct vma_resource but that's
a small price to pay for the simplification.

v2:
- Fix a missing include and declaration (kernel test robot <lkp@intel.com>)

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220110172219.107131-7-thomas.hellstrom@linux.intel.com
2022-01-11 10:54:11 +01:00
Thomas Hellström
950505cabe drm/i915: Asynchronous migration selftest
Add a selftest to exercise asynchronous migration and -unbining.
Extend the gem_migrate selftest to perform the migrations while
depending on a spinner and a bound vma set up on the migrated
buffer object.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220110172219.107131-6-thomas.hellstrom@linux.intel.com
2022-01-11 10:54:11 +01:00
Thomas Hellström
2f6b90da91 drm/i915: Use vma resources for async unbinding
Implement async (non-blocking) unbinding by not syncing the vma before
calling unbind on the vma_resource.
Add the resulting unbind fence to the object's dma_resv from where it is
picked up by the ttm migration code.
Ideally these unbind fences should be coalesced with the migration blit
fence to avoid stalling the migration blit waiting for unbind, as they
can certainly go on in parallel, but since we don't yet have a
reasonable data structure to use to coalesce fences and attach the
resulting fence to a timeline, we defer that for now.

Note that with async unbinding, even while the unbind waits for the
preceding bind to complete before unbinding, the vma itself might have been
destroyed in the process, clearing the vma pages. Therefore we can
only allow async unbinding if we have a refcounted sg-list and keep a
refcount on that for the vma resource pages to stay intact until
binding occurs. If this condition is not met, a request for an async
unbind is diverted to a sync unbind.

v2:
- Use a separate kmem_cache for vma resources for now to isolate their
  memory allocation and aid debugging.
- Move the check for vm closed to the actual unbinding thread. Regardless
  of whether the vm is closed, we need the unbind fence to properly wait
  for capture.
- Clear vma_res::vm on unbind and update its documentation.
v4:
- Take cache coloring into account when searching for vma resources
  pending unbind. (Matthew Auld)
v5:
- Fix timeout and error check in i915_vma_resource_bind_dep_await().
- Avoid taking a reference on the object for async binding if
  async unbind capable.
- Fix braces around a single-line if statement.
v6:
- Fix up the cache coloring adjustment. (Kernel test robot <lkp@intel.com>)
- Don't allow async unbinding if the vma_res pages are not the same as
  the object pages. (Matthew Auld)
v7:
- s/unsigned long/u64/ in a number of places (Matthew Auld)

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220110172219.107131-5-thomas.hellstrom@linux.intel.com
2022-01-11 10:54:11 +01:00
Thomas Hellström
ebf3c361f4 drm/i915: Don't pin the object pages during pending vma binds
A pin-count is already held by vma->pages so taking an additional pin
during async binds is not necessary.

When we introduce async unbinding we have other means of keeping the
object pages alive.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220110172219.107131-4-thomas.hellstrom@linux.intel.com
2022-01-11 10:54:11 +01:00
Thomas Hellström
39a2bd34c9 drm/i915: Use the vma resource as argument for gtt binding / unbinding
When introducing asynchronous unbinding, the vma itself may no longer
be alive when the actual binding or unbinding takes place.

Update the gtt i915_vma_ops accordingly to take a struct i915_vma_resource
instead of a struct i915_vma for the bind_vma() and unbind_vma() ops.
Similarly change the insert_entries() op for struct i915_address_space.

Replace a couple of i915_vma_snapshot members with their newly introduced
i915_vma_resource counterparts, since they have the same lifetime.

Also make sure to avoid changing the struct i915_vma_flags (in particular
the bind flags) async. That should now only be done sync under the
vm mutex.

v2:
- Update the vma_res::bound_flags when binding to the aliased ggtt
v6:
- Remove I915_VMA_ALLOC_BIT (Matthew Auld)
- Change some members of struct i915_vma_resource from unsigned long to u64
  (Matthew Auld)
v7:
- Fix vma resource size parameters to be u64 rather than unsigned long
  (Matthew Auld)

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220110172219.107131-3-thomas.hellstrom@linux.intel.com
2022-01-11 10:53:14 +01:00
Miaoqian Lin
c71af3dae3 drm/sun4i: dw-hdmi: Fix missing put_device() call in sun8i_hdmi_phy_get
The reference taken by 'of_find_device_by_node()' must be released when
not needed anymore.
Add the corresponding 'put_device()' in the error handling path.

Fixes: 9bf3797796 ("drm/sun4i: dw-hdmi: Make HDMI PHY into a platform device")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20220107083633.20843-1-linmq006@gmail.com
2022-01-11 10:49:32 +01:00
Thomas Hellström
e1a4bbb6e8 drm/i915: Initial introduction of vma resources
Introduce vma resources, sort of similar to TTM resources,  needed for
asynchronous bind management. Initially we will use them to hold
completion of unbinding when we capture data from a vma, but they will
be used extensively in upcoming patches for asynchronous vma unbinding.

v6:
- Some documentation updates

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220110172219.107131-2-thomas.hellstrom@linux.intel.com
2022-01-11 09:01:37 +01:00
Matthew Auld
6ef295e342 drm/i915/ttm: ensure we unmap when purging
Purging can happen during swapping out, or directly invoked with the
madvise ioctl. In such cases this doesn't involve a ttm move, which
skips umapping the object.

v2(Thomas):
- add ttm_truncate helper, and just call into i915_ttm_move_notify() to
  handle the unmapping step

Fixes: 213d509277 ("drm/i915/ttm: Introduce a TTM i915 gem object backend")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220106174910.280616-4-matthew.auld@intel.com
(cherry picked from commit ab4911b7d4)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2022-01-10 14:00:47 +00:00
Matthew Auld
8ee262ba79 drm/i915/ttm: add unmap_virtual callback
Ensure we call ttm_bo_unmap_virtual when releasing the pages.
Importantly this should now handle the ttm swapping case, and all other
places that already call into i915_ttm_move_notify().

v2: fix up the selftest

Fixes: cf3e3e86d7 ("drm/i915: Use ttm mmap handling for ttm bo's.")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220106174910.280616-3-matthew.auld@intel.com
(cherry picked from commit 903e038727)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2022-01-10 13:55:07 +00:00
Matthew Auld
03ee595678 drm/i915/ttm: only fault WILLNEED objects
Don't attempt to fault and re-populate purged objects. By some fluke
this passes the dontneed-after-mmap IGT, but for the wrong reasons.

Fixes: cf3e3e86d7 ("drm/i915: Use ttm mmap handling for ttm bo's.")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220106174910.280616-2-matthew.auld@intel.com
(cherry picked from commit f3cb4a2de5)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2022-01-10 13:55:02 +00:00
Matthew Auld
4c2602ba8d drm/i915: don't call free_mmap_offset when purging
The TTM backend is in theory the only user here(also purge should only
be called once we have dropped the pages), where it is setup at object
creation and is only removed once the object is destroyed. Also
resetting the node here might be iffy since the ttm fault handler
uses the stored fake offset to determine the page offset within the pages
array.

This also blows up in the dontneed-before-mmap test, since the
expectation is that the vma_node will live on, until the object is
destroyed:

<2> [749.062902] kernel BUG at drivers/gpu/drm/i915/gem/i915_gem_ttm.c:943!
<4> [749.062923] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
<4> [749.062928] CPU: 0 PID: 1643 Comm: gem_madvise Tainted: G     U  W         5.16.0-rc8-CI-CI_DRM_11046+ #1
<4> [749.062933] Hardware name: Gigabyte Technology Co., Ltd. GB-Z390 Garuda/GB-Z390 Garuda-CF, BIOS IG1c 11/19/2019
<4> [749.062937] RIP: 0010:i915_ttm_mmap_offset.cold.35+0x5b/0x5d [i915]
<4> [749.063044] Code: 00 48 c7 c2 a0 23 4e a0 48 c7 c7 26 df 4a a0 e8 95 1d d0 e0 bf 01 00 00 00 e8 8b ec cf e0 31 f6 bf 09 00 00 00 e8 5f 30 c0 e0 <0f> 0b 48 c7 c1 24 4b 56 a0 ba 5b 03 00 00 48 c7 c6 c0 23 4e a0 48
<4> [749.063052] RSP: 0018:ffffc90002ab7d38 EFLAGS: 00010246
<4> [749.063056] RAX: 0000000000000240 RBX: ffff88811f2e61c0 RCX: 0000000000000006
<4> [749.063060] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000009
<4> [749.063063] RBP: ffffc90002ab7e58 R08: 0000000000000001 R09: 0000000000000001
<4> [749.063067] R10: 000000000123d0f8 R11: ffffc90002ab7b20 R12: ffff888112a1a000
<4> [749.063071] R13: 0000000000000004 R14: ffff88811f2e61c0 R15: ffff888112a1a000
<4> [749.063074] FS:  00007f6e5fcad500(0000) GS:ffff8884ad600000(0000) knlGS:0000000000000000
<4> [749.063078] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [749.063081] CR2: 00007efd264e39f0 CR3: 0000000115fd6005 CR4: 00000000003706f0
<4> [749.063085] Call Trace:
<4> [749.063087]  <TASK>
<4> [749.063089]  __assign_mmap_offset+0x41/0x300 [i915]
<4> [749.063171]  __assign_mmap_offset_handle+0x159/0x270 [i915]
<4> [749.063248]  ? i915_gem_dumb_mmap_offset+0x70/0x70 [i915]
<4> [749.063325]  drm_ioctl_kernel+0xae/0x140
<4> [749.063330]  drm_ioctl+0x201/0x3d0
<4> [749.063333]  ? i915_gem_dumb_mmap_offset+0x70/0x70 [i915]
<4> [749.063409]  ? do_user_addr_fault+0x200/0x670
<4> [749.063415]  __x64_sys_ioctl+0x6d/0xa0
<4> [749.063419]  do_syscall_64+0x3a/0xb0
<4> [749.063423]  entry_SYSCALL_64_after_hwframe+0x44/0xae
<4> [749.063428] RIP: 0033:0x7f6e5f100317

Testcase: igt/gem_madvise/dontneed-before-mmap
Fixes: cf3e3e86d7 ("drm/i915: Use ttm mmap handling for ttm bo's.")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220106174910.280616-1-matthew.auld@intel.com
(cherry picked from commit 658a0c6326)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2022-01-10 13:54:57 +00:00