Commit Graph

20427 Commits

Author SHA1 Message Date
Charlene Liu
1f6c9ab06f drm/amd/display: remove dmcub_support cap dependency
[why]
Match the dmcub_support handling to that of all other DCN versions.

Reviewed-by: Sung joon Kim <Sungjoon.Kim@amd.com>
Reviewed-by: Martin Leung <Martin.Leung@amd.com>
Acked-by: Anson Jacob <Anson.Jacob@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:05 -05:00
Mikita Lipski
f0d0c39149 drm/amd/display: Pass panel inst to a PSR command
[why]
PSR set power command wasn't setting panel instance
and command version which caused both streams
to overwrite the same PSR state.
[how]
Pass panel instance to the set power command function
and to DMUB and set command version enum

Reviewed-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Anson Jacob <Anson.Jacob@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Mikita Lipski <mikita.lipski@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:05 -05:00
Leo (Hanghong) Ma
ebd1e71969 drm/amd/display: Add helper for blanking all dp displays
[Why & How]
1. The code to blank all DP displays is called in many places,
so add helpers in dc_link to make it more concise.
2. Add some checks to fix the dmesg errors at boot and resume from S3
on dcn3.1 during DQE's promotion test.

Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Reviewed-by: Wesley Chalmers <Wesley.Chalmers@amd.com>
Reviewed-by: Aric Cyr <Aric.Cyr@amd.com>
Acked-by: Anson Jacob <Anson.Jacob@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Leo (Hanghong) Ma <hanghong.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:05 -05:00
Ye Guojin
b97788e504 drm/amd/display: remove unnecessary conditional operators
Since the variables named is_end_of_payload and hpd_status are already
of bool type, the ?: conditional operator is no longer necessary.

Clean them up here.
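
A minimal C sketch of this kind of cleanup, assuming a hypothetical
struct hpd_sample that stands in for the driver's real types. It only
illustrates that the ?: form and the direct form return the same value:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver state; not the real DC type. */
struct hpd_sample {
	bool hpd_status;
};

/* Before: the ?: operator maps a bool onto the very same bool. */
bool hpd_is_high_verbose(const struct hpd_sample *s)
{
	assert(s != NULL);
	return s->hpd_status ? true : false;
}

/* After: return the bool directly; the result is unchanged. */
bool hpd_is_high(const struct hpd_sample *s)
{
	assert(s != NULL);
	return s->hpd_status;
}
```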

Reported-by: Zeal Robot <zealci@zte.com.cn>
Reviewed-by: Simon Ser <contact@emersion.fr>
Signed-off-by: Ye Guojin <ye.guojin@zte.com.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:04 -05:00
Nirmoy Das
26db557e35 drm/amdgpu: return early on error while setting bar0 memtype
We set WC memtype for aper_base but don't check the return value
of arch_io_reserve_memtype_wc(). Be more defensive and return
early on error.
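
The early-return pattern can be sketched in user-space C as follows.
fake_reserve_memtype_wc() and struct fake_dev are hypothetical stand-ins
for arch_io_reserve_memtype_wc() and the device structure; the real
driver code is not reproduced here:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for arch_io_reserve_memtype_wc(); it fails for
 * a zero-sized range so that the error path can be exercised. */
static int fake_reserve_memtype_wc(uint64_t start, uint64_t size)
{
	(void)start;
	return size ? 0 : -EINVAL;
}

/* Hypothetical device state for the sketch. */
struct fake_dev {
	uint64_t aper_base;
	uint64_t aper_size;
	int later_steps_ran;	/* set once the remaining init has run */
};

int fake_bo_init(struct fake_dev *dev)
{
	int r;

	assert(dev != NULL);

	/* Check the return value and bail out before any later step. */
	r = fake_reserve_memtype_wc(dev->aper_base, dev->aper_size);
	if (r)
		return r;

	dev->later_steps_ran = 1;
	return 0;
}
```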

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:04 -05:00
Nirmoy Das
d5a28852e8 drm/amdgpu: remove unnecessary checks
amdgpu_ttm_backend_bind() is only needed for TTM_PL_TT
and AMDGPU_PL_PREEMPT.

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:04 -05:00
Felix Kuehling
b5f5738480 drm/amdkfd: Add sysfs bitfields and enums to uAPI
These bits are de-facto part of the uAPI, so declare them in a uAPI header.

The corresponding bit-fields and enums in user mode are defined in
https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/blob/master/include/hsakmttypes.h

HSA_CAP_...           -> HSA_CAPABILITY
HSA_MEM_HEAP_TYPE_... -> HSA_HEAPTYPE
HSA_MEM_FLAGS_...     -> HSA_MEMORYPROPERTY
HSA_CACHE_TYPE_...    -> HsaCacheType
HSA_IOLINK_TYPE_...   -> HSA_IOLINKTYPE
HSA_IOLINK_FLAGS_...  -> HSA_LINKPROPERTY

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:03 -05:00
Evan Quan
087451f372 drm/amdgpu: use generic fb helpers instead of setting up AMD own's.
With the shadow buffer support from generic framebuffer emulation, it is
now possible for runpm to kick in when the console has no updates.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:03 -05:00
Graham Sider
b5d1d755c1 drm/amdkfd: remove kgd_dev declaration and initialization
Completes removal of kgd_dev. Direct references to amdgpu_device objects
should now be used instead.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:03 -05:00
Graham Sider
56c5977eae drm/amdkfd: replace/remove remaining kgd_dev references
Remove get_amdgpu_device and other remaining kgd_dev references aside
from declaration/kfd struct entry and initialization.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:02 -05:00
Graham Sider
dff63da93e drm/amdkfd: replace kgd_dev in gpuvm amdgpu_amdkfd funcs
Modified definitions:

- amdgpu_amdkfd_gpuvm_acquire_process_vm
- amdgpu_amdkfd_gpuvm_release_process_vm
- amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu
- amdgpu_amdkfd_gpuvm_free_memory_of_gpu
- amdgpu_amdkfd_gpuvm_map_memory_to_gpu
- amdgpu_amdkfd_gpuvm_unmap_memory_from_gpu
- amdgpu_amdkfd_gpuvm_sync_memory
- amdgpu_amdkfd_gpuvm_map_gtt_bo_to_kernel
- amdgpu_amdkfd_gpuvm_unmap_gtt_bo_from_kernel
- amdgpu_amdkfd_gpuvm_get_vm_fault_info
- amdgpu_amdkfd_gpuvm_import_dmabuf
- amdgpu_amdkfd_get_tile_config

Removed:

- get_amdgpu_device

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:02 -05:00
Graham Sider
574c4183ef drm/amdkfd: replace kgd_dev in get amdgpu_amdkfd funcs
Modified definitions:

- amdgpu_amdkfd_get_fw_version
- amdgpu_amdkfd_get_local_mem_info
- amdgpu_amdkfd_get_gpu_clock_counter
- amdgpu_amdkfd_get_max_engine_clock_in_mhz
- amdgpu_amdkfd_get_cu_info
- amdgpu_amdkfd_get_dmabuf_info
- amdgpu_amdkfd_get_vram_usage
- amdgpu_amdkfd_get_hive_id
- amdgpu_amdkfd_get_unique_id
- amdgpu_amdkfd_get_mmio_remap_phys_addr
- amdgpu_amdkfd_get_num_gws
- amdgpu_amdkfd_get_asic_rev_id
- amdgpu_amdkfd_get_noretry
- amdgpu_amdkfd_get_xgmi_hops_count
- amdgpu_amdkfd_get_xgmi_bandwidth_mbytes
- amdgpu_amdkfd_get_pcie_bandwidth_mbytes

Also replaces kfd_device_by_kgd with kfd_device_by_adev, now
searching via adev rather than kgd.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:02 -05:00
Graham Sider
6bfc7c7e17 drm/amdkfd: replace kgd_dev in various amgpu_amdkfd funcs
Modified definitions:

- amdgpu_amdkfd_submit_ib
- amdgpu_amdkfd_set_compute_idle
- amdgpu_amdkfd_have_atomics_support
- amdgpu_amdkfd_flush_gpu_tlb_pasid
- amdgpu_amdkfd_flush_gpu_tlb_pasid
- amdgpu_amdkfd_gpu_reset
- amdgpu_amdkfd_alloc_gtt_mem
- amdgpu_amdkfd_free_gtt_mem
- amdgpu_amdkfd_alloc_gws
- amdgpu_amdkfd_free_gws
- amdgpu_amdkfd_ras_poison_consumption_handler

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:01 -05:00
Graham Sider
3356c38dc1 drm/amdkfd: replace kgd_dev in various kfd2kgd funcs
Modified definitions:

- program_sh_mem_settings
- set_pasid_vmid_mapping
- init_interrupts
- address_watch_disable
- address_watch_execute
- wave_control_execute
- address_watch_get_offset
- get_atc_vmid_pasid_mapping_info
- set_scratch_backing_va
- set_vm_context_page_table_base
- read_vmid_from_vmfault_reg
- get_cu_occupancy
- program_trap_handler_settings

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:01 -05:00
Graham Sider
420185fdad drm/amdkfd: replace kgd_dev in hqd/mqd kfd2kgd funcs
Modified definitions:

- hqd_load
- hiq_mqd_load
- hqd_sdma_load
- hqd_dump
- hqd_sdma_dump
- hqd_is_occupied
- hqd_destroy
- hqd_sdma_is_occupied
- hqd_sdma_destroy

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:01 -05:00
Graham Sider
c531a58bb6 drm/amdkfd: replace kgd_dev in static gfx v10_3 funcs
Static funcs in amdgpu_amdkfd_gfx_v10_3.c now using amdgpu_device.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:00 -05:00
Graham Sider
4056b03377 drm/amdkfd: replace kgd_dev in static gfx v10 funcs
Static funcs in amdgpu_amdkfd_gfx_v10.c now using amdgpu_device.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:00 -05:00
Graham Sider
9a17c9b79b drm/amdkfd: replace kgd_dev in static gfx v9 funcs
Static funcs in amdgpu_amdkfd_gfx_v9.c now using amdgpu_device.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:58:00 -05:00
Graham Sider
1cca608742 drm/amdkfd: replace kgd_dev in static gfx v8 funcs
Static funcs in amdgpu_amdkfd_gfx_v8.c now using amdgpu_device.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:57:59 -05:00
Graham Sider
9365fbf3d7 drm/amdkfd: replace kgd_dev in static gfx v7 funcs
Static funcs in amdgpu_amdkfd_gfx_v7.c now using amdgpu_device.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:57:59 -05:00
Graham Sider
c6c5744638 drm/amdkfd: add amdgpu_device entry to kfd_dev
Patch series to remove kgd_dev struct and replace all instances with
amdgpu_device objects.

amdgpu_device needs to be declared in kgd_kfd_interface.h to be visible
to kfd2kgd_calls.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-17 16:57:59 -05:00
Linus Torvalds
304ac8032d Merge tag 'drm-next-2021-11-12' of git://anongit.freedesktop.org/drm/drm
Pull more drm updates from Dave Airlie:
 "I missed a drm-misc-next pull for the main pull last week. It wasn't
  that major and isn't the bulk of this at all. This has a bunch of
  fixes all over, a lot for amdgpu and i915.

  bridge:
   - HPD improvements for lt9611uxc
   - eDP aux-bus support for ps8640
   - LVDS data-mapping selection support

  ttm:
   - remove huge page functionality (needs reworking)
   - fix a race condition during BO eviction

  panels:
   - add some new panels

  fbdev:
   - fix double-free
   - remove unused scrolling acceleration
   - CONFIG_FB dep improvements

  locking:
   - improve contended locking logging
   - naming collision fix

  dma-buf:
   - add dma_resv_for_each_fence iterator
   - fix fence refcounting bug
   - name locking fixes

  prime:
   - fix object references during mmap

  nouveau:
   - various code style changes
   - refcount fix
   - device removal fixes
   - protect client list with a mutex
   - fix CE0 address calculation

  i915:
   - DP rates related fixes
   - Revert disabling dual eDP that was causing state readout problems
   - put the cdclk vtables in const data
   - Fix DVO port type for older platforms
   - Fix blankscreen by turning DP++ TMDS output buffers on encoder->shutdown
   - CCS FBs related fixes
   - Fix recursive lock in GuC submission
   - Revert guc_id from i915_request tracepoint
   - Build fix around dmabuf

  amdgpu:
   - GPU reset fix
   - Aldebaran fix
   - Yellow Carp fixes
   - DCN2.1 DMCUB fix
   - IOMMU regression fix for Picasso
   - DSC display fixes
   - BPC display calculation fixes
   - Other misc display fixes
   - Don't allow partial copy from user for DC debugfs
   - SRIOV fixes
   - GFX9 CSB pin count fix
   - Various IP version check fixes
   - DP 2.0 fixes
   - Limit DCN1 MPO fix to DCN1

  amdkfd:
   - SVM fixes
   - Fix gfx version for renoir
   - Reset fixes

  udl:
   - timeout fix

  imx:
   - circular locking fix

  virtio:
   - NULL ptr deref fix"

* tag 'drm-next-2021-11-12' of git://anongit.freedesktop.org/drm/drm: (126 commits)
  drm/ttm: Double check mem_type of BO while eviction
  drm/amdgpu: add missed support for UVD IP_VERSION(3, 0, 64)
  drm/amdgpu: drop jpeg IP initialization in SRIOV case
  drm/amd/display: reject both non-zero src_x and src_y only for DCN1x
  drm/amd/display: Add callbacks for DMUB HPD IRQ notifications
  drm/amd/display: Don't lock connection_mutex for DMUB HPD
  drm/amd/display: Add comment where CONFIG_DRM_AMD_DC_DCN macro ends
  drm/amdkfd: Fix retry fault drain race conditions
  drm/amdkfd: lower the VAs base offset to 8KB
  drm/amd/display: fix exit from amdgpu_dm_atomic_check() abruptly
  drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov
  drm/amdgpu: fix uvd crash on Polaris12 during driver unloading
  drm/i915/adlp/fb: Prevent the mapping of redundant trailing padding NULL pages
  drm/i915/fb: Fix rounding error in subsampled plane size calculation
  drm/i915/hdmi: Turn DP++ TMDS output buffers back on in encoder->shutdown()
  drm/locking: fix __stack_depot_* name conflict
  drm/virtio: Fix NULL dereference error in virtio_gpu_poll
  drm/amdgpu: fix SI handling in amdgpu_device_asic_has_dc_support()
  drm/amdgpu: Fix dangling kfd_bo pointer for shared BOs
  drm/amd/amdkfd: Don't sent command to HWS on kfd reset
  ...
2021-11-12 12:11:07 -08:00
Dave Airlie
447212bb4f BackMerge tag 'v5.15' into drm-next
I got a drm-fixes which had some 5.15 stuff in it, so to avoid
the mess just backmerge here.

Linux 5.15

Signed-off-by: Dave Airlie <airlied@redhat.com>
2021-11-12 09:23:16 +10:00
Alistair Popple
ab09243aa9 mm/migrate.c: remove MIGRATE_PFN_LOCKED
MIGRATE_PFN_LOCKED is used to indicate to migrate_vma_prepare() that a
source page was already locked during migrate_vma_collect().  If it
wasn't then a second attempt is made to lock the page.  However if
the first attempt failed it's unlikely a second attempt will succeed,
and the retry adds complexity.  So clean this up by removing the retry
and MIGRATE_PFN_LOCKED flag.

Destination pages are also meant to have the MIGRATE_PFN_LOCKED flag
set, but nothing actually checks that.

Link: https://lkml.kernel.org/r/20211025041608.289017-1-apopple@nvidia.com
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-11 09:34:35 -08:00
Dave Airlie
951bad0bd9 Merge tag 'amd-drm-fixes-5.16-2021-11-10' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-fixes-5.16-2021-11-10:

amdgpu:
- Don't allow partial copy from user for DC debugfs
- SRIOV fixes
- GFX9 CSB pin count fix
- Various IP version check fixes
- DP 2.0 fixes
- Limit DCN1 MPO fix to DCN1

amdkfd:
- SVM fixes
- Reset fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211110222536.7527-1-alexander.deucher@amd.com
2021-11-11 10:14:44 +10:00
Dave Airlie
f8ca7b7419 Merge tag 'drm-misc-next-fixes-2021-11-10' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
Removed the TTM Huge Page functionality to address a crash, a timeout
fix for udl, CONFIG_FB dependency improvements, a fix for a circular
locking dependency in imx, a NULL pointer dereference fix for virtio, and a
naming collision fix for drm/locking.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20211110082114.vfpkpnecwdfg27lk@gilmour
2021-11-11 08:14:19 +10:00
Guchun Chen
4d395f938a drm/amdgpu: add missed support for UVD IP_VERSION(3, 0, 64)
Fixes: 96b8dd4423 ("drm/amdgpu/amdgpu_vcn: convert to IP version checking")
Signed-off-by: Flora Cui <flora.cui@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-10 12:03:41 -05:00
Guchun Chen
b45a36032d drm/amdgpu: drop jpeg IP initialization in SRIOV case
Fixes: b05b9c591f ("drm/amdgpu: clean up set IP function")
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-10 12:00:08 -05:00
Shirish S
4375d6255d drm/amd/display: reject both non-zero src_x and src_y only for DCN1x
[Why]
Video plane gets rejected for non-zero src_y and src_x on DCN2.x.

[How]
Limit the rejection to DCN1.x. Verified MPO by dragging video
playback beyond the display's left (0, 0) coordinates.

Fixes: d89f6048bd ("drm/amd/display: Reject non-zero src_y and src_x for video planes")
Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-10 11:59:34 -05:00
Nicholas Kazlauskas
c40a09e56f drm/amd/display: Add callbacks for DMUB HPD IRQ notifications
[Why]
We need HPD IRQ notifications (RX, short pulse) to properly handle
DP MST for DPIA connections.

[How]
A null pointer exception currently occurs when these are received
so add a check to validate that we have a handler installed for
the notification.

Extend the HPD handler to also handle HPD IRQ (RX) since the logic is
the same.
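
A rough user-space sketch of the idea, assuming a hypothetical callback
table (struct notify_table, enum notify_type and the function names are
illustrative, not the driver's API): the dispatcher validates that a
handler is installed before calling it, and the same handler is
registered for both HPD and HPD IRQ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical notification types for the sketch. */
enum notify_type { NOTIFY_HPD, NOTIFY_HPD_IRQ, NOTIFY_OTHER, NOTIFY_MAX };

typedef void (*notify_handler)(int *counter);

struct notify_table {
	notify_handler cb[NOTIFY_MAX];
};

/* Shared handler: the logic is the same for HPD and HPD IRQ. */
static void handle_hpd(int *counter)
{
	(*counter)++;
}

void register_hpd_handlers(struct notify_table *t)
{
	assert(t != NULL);
	t->cb[NOTIFY_HPD] = handle_hpd;
	t->cb[NOTIFY_HPD_IRQ] = handle_hpd;
}

/* Returns 1 if a handler ran, 0 if the notification was skipped. */
int dispatch_notify(const struct notify_table *t, enum notify_type type,
		    int *counter)
{
	assert(t != NULL && counter != NULL);

	/* Skip instead of calling through a NULL function pointer. */
	if ((unsigned int)type >= NOTIFY_MAX || t->cb[type] == NULL)
		return 0;

	t->cb[type](counter);
	return 1;
}
```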

Fixes: e27c41d5b0 ("drm/amd/display: Support for DMUB HPD interrupt handling")

Reviewed-by: Wayne Lin <Wayne.Lin@amd.com>
Reviewed-by: Jude Shih <shenshih@amd.com>
Acked-by: Anson Jacob <Anson.Jacob@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-09 17:08:00 -05:00
Nicholas Kazlauskas
d82b3266ef drm/amd/display: Don't lock connection_mutex for DMUB HPD
[Why]
Per DRM spec we only need to hold that lock when touching
connector->state - which we do not do in that handler.

Taking this lock introduces unnecessary dependencies with other
threads which is bad for performance and opens up the potential for
a deadlock since there are multiple locks being held at once.

[How]
Remove the connection_mutex lock/unlock routine and just iterate over
the drm connectors normally. The iter helpers implicitly lock the
connection list so this is safe to do.

DC link access also does not need to be guarded since the link
table is static at creation - we don't dynamically add or remove links,
just streams.

Fixes: e27c41d5b0 ("drm/amd/display: Support for DMUB HPD interrupt handling")

Reviewed-by: Jude Shih <shenshih@amd.com>
Acked-by: Anson Jacob <Anson.Jacob@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-09 17:08:00 -05:00
Anson Jacob
433e5dec41 drm/amd/display: Add comment where CONFIG_DRM_AMD_DC_DCN macro ends
Trivial patch which adds a comment for macro
endif's in amdgpu_dm.c

Reviewed-by: Ariel Bernstein <Eric.Bernstein@amd.com>
Reviewed-by: Harry Wentland <Harry.Wentland@amd.com>
Acked-by: Anson Jacob <Anson.Jacob@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Anson Jacob <Anson.Jacob@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-09 17:08:00 -05:00
Felix Kuehling
a44fe9ee05 drm/amdkfd: Fix retry fault drain race conditions
The check for whether to drain retry faults must be under the mmap write
lock to serialize with munmap notifier callbacks.

We were also missing checks on child ranges. To fix that, simplify the
logic by using a flag rather than checking on each prange. That also
allows draining less frequently when many ranges are unmapped at once.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: Philip Yang <Philip.Yang@amd.com>
Tested-by: Alex Sierra <Alex.Sierra@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-09 17:08:00 -05:00
Alex Sierra
3aac6aa630 drm/amdkfd: lower the VAs base offset to 8KB
The low 16MB of virtual address space are currently reserved for kernel
mode allocations mapped into user virtual address space. This causes
conflicts with HMM/SVM mappings at low virtual addresses. We tried to
move those kernel mode allocations to the upper half of the 64-bit
virtual address space for GFX9, which is naturally reserved for kernel
use. However, TBA (trap handler code) has problems accessing addresses
in the high virtual space. We have decided to set this to 8KB of the
lower address space as a temporary fix while we investigate the TBA address
problem. It is very unlikely for user space to map memory in this low
region.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-09 17:08:00 -05:00
Shirish S
706bc8c501 drm/amd/display: fix exit from amdgpu_dm_atomic_check() abruptly
Make the action upon failure in "drm_atomic_add_affected_connectors()"
consistent with the rest of failures in amdgpu_dm_atomic_check().

Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-09 17:08:00 -05:00
shaoyunl
9f4f2c1a35 drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov
The KFD pre_reset should be called before the reset is executed. It will
hold the lock to prevent other ROCm processes from sending packets to the HIQ
while the host executes the real reset on the HW.

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-09 17:08:00 -05:00
Evan Quan
4fc30ea780 drm/amdgpu: fix uvd crash on Polaris12 during driver unloading
An earlier change (below) targeted this issue:
d82e2c249c ("drm/amdgpu: Fix crash on device remove/driver unload")
But the fix for VI ASICs was missing there. This is a supplement for
that.

Fixes: d82e2c249c ("drm/amdgpu: Fix crash on device remove/driver unload")

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-09 17:06:15 -05:00
Alex Deucher
2d32ffd6e9 drm/amdgpu: fix SI handling in amdgpu_device_asic_has_dc_support()
Properly handle SI DC support when CONFIG_DRM_AMD_DC_SI is not
set.

Fixes: f7f12b2582 ("drm/amdgpu: default to true in amdgpu_device_asic_has_dc_support")
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05 14:12:53 -04:00
Felix Kuehling
5702d05295 drm/amdgpu: Fix dangling kfd_bo pointer for shared BOs
If a kfd_bo was shared (e.g. a dmabuf export), the original kfd_bo may be
freed when the amdgpu_bo still lives on. Free the kfd_bo struct in the
release_notify callback when the amdgpu_bo is freed.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-By: Ramesh Errabolu <Ramesh.Errabolu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05 14:12:45 -04:00
shaoyunl
b8c20c74ab drm/amd/amdkfd: Don't sent command to HWS on kfd reset
When kfd needs to be reset, sending commands to HWS might cause a hang and an unnecessary timeout.
This change tries not to touch the HW in pre_reset and keeps queues in the evicted state
when the reset is done, so they are not put back on the runlist. These queues will be destroyed
on process termination.

Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05 14:12:38 -04:00
Evan Quan
e6ef9b396b drm/amdgpu: correctly toggle gfx on/off around RLC_SPM_* register access
As part of the ib padding process, accessing the RLC_SPM_* register may
trigger a gfx hang, since gfxoff may already be kicked during the whole period.
To address that, we manually toggle gfx on/off around the RLC_SPM_*
register access.

This can resolve the gfx hang issue observed on running Talos with RDP launched
in parallel.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05 14:12:29 -04:00
Tao Zhou
7513c9ff44 drm/amdgpu: correct xgmi ras error count reset
The error count reset for xgmi3x16 pcs was missing.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05 14:12:21 -04:00
Mario Limonciello
c451c979ea drm/amd/pm: Correct DPMS disable IP version check
Previously there was a check based on chip # for chips that aligned to
>=CHIP_NAVI10 to have RLC stopped as part of DPMS check.  This was because
of gfxclk being controlled by RLC in the newer designs.

As part of IP version checking though, this got changed to match IP
version for SMU.  Because Renoir designs also include smu11 that meant
that even GFX9 started to stop RLC earlier.

Adjust to match GFX IP version instead of SMU IP version to restore the
previous behavior.

Fixes: a8967967f6 ("drm/amdgpu/amdgpu_smu: convert to IP version checking")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05 14:11:43 -04:00
YuBiao Wang
6ddc0eb7a2 drm/amd/amdgpu: Fix csb.bo pin_count leak on gfx 9
[Why]
csb bo is not unpinned in gfx 9. This leads to a pin_count leak on
driver unload.

[How]
Call bo_free_kernel corresponding to bo_create_kernel in
gfx_rlc_init_csb. This will also unify the code path with other gfx
versions.

Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05 14:11:32 -04:00
YuBiao Wang
c4fc13b581 drm/amd/amdgpu: Avoid writing GMC registers under sriov in gmc9
[Why]
For Vega10, disabling gart of gfxhub could mess up KIQ and PSP
under sriov mode, and lead to DMAR on host side.

[How]
Skip writing GMC registers under sriov.

Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05 14:11:20 -04:00
Alex Deucher
e9c76719c1 drm/amdgpu/powerplay: fix sysfs_emit/sysfs_emit_at handling
sysfs_emit and sysfs_emit_at require a page boundary
aligned buf address. Make them happy!

v2: fix sysfs_emit -> sysfs_emit_at missed conversions
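
To illustrate the constraint, here is a user-space analogue written
under stated assumptions: fake_emit_at() and FAKE_PAGE_SIZE are
hypothetical and only mimic the "page-aligned buffer plus offset"
contract described above. It is not the kernel implementation. The
caller keeps passing the page-aligned start of the buffer and advances
the offset, rather than passing an advanced pointer:

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define FAKE_PAGE_SIZE 4096u	/* assumed page size for the sketch */

/* Writes at offset `at` inside a page-aligned buffer. Returns the number
 * of characters written, or 0 if the buffer is misaligned or the offset
 * is outside the page. */
int fake_emit_at(char *page, unsigned int at, const char *fmt, ...)
{
	va_list args;
	unsigned int room;
	int len;

	assert(page != NULL && fmt != NULL);

	if ((uintptr_t)page % FAKE_PAGE_SIZE != 0 || at >= FAKE_PAGE_SIZE)
		return 0;

	room = FAKE_PAGE_SIZE - at;
	va_start(args, fmt);
	len = vsnprintf(page + at, room, fmt, args);
	va_end(args);

	if (len < 0)
		return 0;
	if ((unsigned int)len >= room)
		len = (int)(room - 1);	/* output was truncated */
	return len;
}
```

Passing `page + size` as the buffer, as a sprintf-style conversion might
do, fails the alignment check in this sketch; passing `page` with `size`
as the offset does not.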

Cc: Lang Yu <lang.yu@amd.com>
Cc: Darren Powell <darren.powell@amd.com>
Fixes: 6db0c87a0a ("amdgpu/pm: Replace hwmgr smu usage of sprintf with sysfs_emit")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1774
Reviewed-by: Lang Yu <lang.yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05 14:11:14 -04:00
Kent Russell
7ef6b7f844 drm/amdgpu: Make sure to reserve BOs before adding or removing
BOs need to be reserved before they are added or removed, so ensure that
they are reserved during kfd_mem_attach and kfd_mem_detach

Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05 14:11:05 -04:00
Alex Sierra
a6283010e2 drm/amdkfd: avoid recursive lock in migrations back to RAM
[Why]:
When we call hmm_range_fault to map memory after a migration, we don't
expect memory to be migrated again as a result of hmm_range_fault. The
driver ensures that all memory is in GPU-accessible locations so that
no migration should be needed. However, there is one corner case where
hmm_range_fault can unexpectedly cause a migration from DEVICE_PRIVATE
back to system memory due to a write-fault when a system memory page in
the same range was mapped read-only (e.g. COW). Ranges with individual
pages in different locations are usually the result of failed page
migrations (e.g. page lock contention). The unexpected migration back
to system memory causes a deadlock from recursive locking in our
driver.

[How]:
Add a new task reference member to the svm_range_list struct.
Set it to the "current" reference right before hmm_range_fault
is called. The svm_migrate_to_ram callback function checks this member
against the "current" reference. If they are equal, the migration is
ignored.
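
The guard can be sketched in plain C as below. struct task, struct
range_list and the fake_* functions are hypothetical stand-ins for
task_struct, svm_range_list, hmm_range_fault and svm_migrate_to_ram.
The sketch only models the re-entrancy check, not locking or actual
migration:

```c
#include <assert.h>
#include <stddef.h>

struct task { int pid; };		/* stand-in for task_struct */

struct range_list {
	const struct task *faulting_task;	/* the new member */
	int migrations;				/* migrations actually done */
};

/* Stand-in for the migrate-to-RAM callback. Returns 1 if it migrated,
 * 0 if the request came from the task that is currently faulting. */
int fake_migrate_to_ram(struct range_list *l, const struct task *cur)
{
	assert(l != NULL);

	if (l->faulting_task == cur)
		return 0;	/* ignore: would recurse into our own locks */

	l->migrations++;
	return 1;
}

/* Stand-in for the fault path: record "current" before the call that
 * may re-enter the callback, and clear it afterwards. */
int fake_range_fault(struct range_list *l, const struct task *cur)
{
	int migrated;

	assert(l != NULL);

	l->faulting_task = cur;
	migrated = fake_migrate_to_ram(l, cur);	/* simulated re-entry */
	l->faulting_task = NULL;
	return migrated;
}
```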

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05 14:10:58 -04:00
Harry Wentland
25a1a08fe7 drm/amd/display: Don't allow partial copy_from_user
There is no reason to allow for partial buffers from userspace in our
debugfs. In this particular case callers will zero out the wr_buf but if
callers in the future don't do that we might be looking at corrupt data.

Linus puts it better than I can in
https://lkml.org/lkml/2021/10/26/993
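
A hedged user-space sketch of the stricter behaviour:
fake_copy_from_user() is a hypothetical stand-in that, like
copy_from_user(), returns the number of bytes it could not copy. The
write helper treats any non-zero remainder as a failure instead of
accepting a partial buffer. Names and sizes are illustrative only:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for copy_from_user(): copies at most `avail`
 * bytes and returns the number of bytes NOT copied. */
static size_t fake_copy_from_user(char *dst, const char *src, size_t n,
				  size_t avail)
{
	size_t done = n < avail ? n : avail;

	memcpy(dst, src, done);
	return n - done;
}

/* Returns 0 on success, -EINVAL if the input does not fit, and -EFAULT
 * on any partial copy. */
int fake_debugfs_write(char *wr_buf, size_t wr_size, const char *ubuf,
		       size_t n, size_t avail)
{
	assert(wr_buf != NULL && ubuf != NULL);

	if (n >= wr_size)
		return -EINVAL;	/* keep room for the terminating NUL */

	if (fake_copy_from_user(wr_buf, ubuf, n, avail) != 0)
		return -EFAULT;	/* reject partial copies outright */

	wr_buf[n] = '\0';
	return 0;
}
```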

Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-11-05 14:10:39 -04:00
Jason Gunthorpe
0d97950953 drm/ttm: remove ttm_bo_vm_insert_huge()
The huge page functionality in TTM does not work safely because PUD and
PMD entries do not have a special bit.

get_user_pages_fast() considers any page that passed pmd_huge() as
usable:

	if (unlikely(pmd_trans_huge(pmd) || pmd_huge(pmd) ||
		     pmd_devmap(pmd))) {

And vmf_insert_pfn_pmd_prot() unconditionally sets

	entry = pmd_mkhuge(pfn_t_pmd(pfn, prot));

e.g. on x86 the page will be _PAGE_PRESENT | _PAGE_PSE.

As such gup_huge_pmd() will try to deref a struct page:

	head = try_grab_compound_head(pmd_page(orig), refs, flags);

and thus crash.

Thomas further notices that the drivers are not expecting the struct page
to be used by anything - in particular the refcount incr above will cause
them to malfunction.

Thus everything about this is not able to fully work correctly considering
GUP_fast. Delete it entirely. It can return someday along with a proper
PMD/PUD_SPECIAL bit in the page table itself to gate GUP_fast.

Fixes: 314b6580ad ("drm/ttm, drm/vmwgfx: Support huge TTM pagefaults")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Thomas Hellström <thomas.helllstrom@linux.intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
[danvet: Update subject per Thomas' & Christian's review]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/0-v2-a44694790652+4ac-ttm_pmd_jgg@nvidia.com
2021-11-05 11:13:19 +01:00