Make sure the clockgating feature is supported before action.
Otherwise, the feature may be disabled unexpectedly on enablement
request.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The scratch register should be accessed through MMIO instead of RLCG
in SRIOV, since it being used in RLCG register access function.
Fixes: d54762cc3e ("drm/amdgpu: nuke dynamic gfx scratch reg allocation")
Signed-off-by: ZhenGuo Yin <zhenguo.yin@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Similar to the handling of amdgpu_sa_bo_next_hole in commit 6a15f3ff19
("drm/amdgpu: Initialize fences array entries in amdgpu_sa_bo_next_hole"),
we thought a patch might be needed here as well.
The entries were only initialized once in radeon_sa_bo_new. If a fence
wasn't signalled yet in the first radeon_sa_bo_next_hole call, but then
got signalled before a later radeon_sa_bo_next_hole call, it could
destroy the fence but leave its pointer in the array, resulting in
use-after-free in radeon_sa_bo_new.
Signed-off-by: Xiaohui Zhang <xiaohuizhang@ruc.edu.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
GCC throw warnings for the function dcn21_update_bw_bounding_box and
dcn316_update_bw_bounding_box due to its frame size that looks like
this:
error: the frame size of 1936 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
For fixing this issue I dropped an intermadiate variable.
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Extend KFD device topology to surface peer-to-peer links among
GPU devices connected over PCIe or xGMI. Enabling HSA_AMD_P2P is
REQUIRED to surface peer-to-peer links.
Prior to this KFD did not expose to user mode any P2P links or
indirect links that go over two or more direct hops. Old versions
of the Thunk used to make up their own P2P and indirect links without
the information about peer-accessibility and chipset support available
to the kernel mode driver. In this patch we expose P2P links in a new
sysfs directory to provide more reliable P2P link information to user
mode.
Old versions of the Thunk will continue to work as before and ignore
the new directory. This avoids conflicts between P2P links exposed by
KFD and P2P links created by the Thunk itself. New versions of the Thunk
will use only the P2P links provided in the new p2p_links directory, if
it exists, or fall back to the old code path on older KFDs that don't
expose p2p_links.
Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Similar to the handling of amdgpu_mode_dumb_create in commit 54ef0b5461
("drm/amdgpu: integer overflow in amdgpu_mode_dumb_create()"),
we thought a patch might be needed here as well.
args->size is a u64. arg->pitch and args->height are u32. The
multiplication will overflow instead of using the high 32 bits as
intended.
Signed-off-by: Xiaohui Zhang <xiaohuizhang@ruc.edu.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
GCC throw warnings for the function dcn31_update_bw_bounding_box and
dcn316_update_bw_bounding_box due to its frame size that looks like
this:
error: the frame size of 1936 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
For fixing this issue I dropped an intermadiate variable.
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add support for peer-to-peer communication among AMD GPUs over PCIe
bus. Support REQUIRES enablement of config HSA_AMD_P2P.
Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Extend current kernel config requirements of amdgpu by adding config
HSA_AMD_P2P. Enabling HSA_AMD_P2P is REQUIRED to support peer-to-peer
communication between AMD GPU devices over PCIe bus
Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
GCC throw warnings for the function dcn20_update_bounding_box due to its
frame size that looks like this:
error: the frame size of 1936 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
This commit fixes this issue by eliminating an intermediary variable
that creates a large array.
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Hamza Mahfooz <hamza.mahfooz@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
All other chips, from gfx6-gfx10, now include the MODE register at the
end of the wave debug state. This appears to have been missed in gfx11,
so this patch adds in MODE to the debug state for gfx11.
Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Since it's inception in 2012 it has been understood that the DRM GEM CMA
helpers do not depend on CMA as the backend allocator. In fact the first
bug fix to ensure the cma-helpers work correctly with an IOMMU backend
appeared in 2014. However currently the documentation for
drm_gem_cma_create() talks about "a contiguous chunk of memory" without
making clear which address space it will be a contiguous part of.
Additionally the CMA introduction is actively misleading because it only
contemplates the CMA backend.
This matters because when the device accesses the bus through an IOMMU
(and don't use the CMA backend) then the allocated memory is contiguous
only in the IOVA space. This is a significant difference compared to the
CMA backend and the behaviour can be a surprise even to someone who does
a reasonable level of code browsing (but doesn't find all the relevant
function pointers ;-) ).
Improve the kernel doc comments accordingly.
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220608135821.1153346-1-daniel.thompson@linaro.org
A new PVC+DG2 workaround has appeared recently:
- Wa_16015675438
And a couple existing DG2 workarounds have been extended to PVC:
- Wa_14015795083
- Wa_18018781329
Note that Wa_16015675438 asks us to program a register that is in the
0x2xxx range typically associated with the RCS engine, even though PVC
does not have an RCS. By default the GuC will think we've made a
mistake and throw an exception when it sees this register on a CCS
engine's save/restore list, so we need to pass an extra GuC control flag
to tell it that this is expected and not a problem.
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Signed-off-by: Badal Nilawar <badal.nilawar@intel.com>
Signed-off-by: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220608005108.3717895-1-matthew.d.roper@intel.com
We're not parsing the 5.4 Gbps value for the old eDP fast link
training link rate, nor are we parsing the new fast link training
link rate field. Remedy both.
Also we'll now use the actual link rate instead of the DPCD BW
register value.
Note that we're not even using this information for anything
currently, so should perhaps just nuke it all unless someone
is planning on implementing fast link training finally...
v2: Stop using the DPCD BW values (Jani)
*20 instead of *2 to get the rate in correct units (Jani)
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220602205649.11283-1-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
All other chips, from gfx6-gfx10, now include the MODE register at the
end of the wave debug state. This appears to have been missed in gfx11,
so this patch adds in MODE to the debug state for gfx11.
Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>