This is a preparation work to make reserved doorbell index per device,
instead of using a global macro definition. By doing this, we can easily
change doorbell layout for future ASICs while not affecting ASICs in
production.
Signed-off-by: Oak Zeng <ozeng@amd.com>
Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
Two non-blocking commits in succession can result in a sequence where
the same dc->current_state is queried for both commits.
1. 1st commit -> check -> commit -> swaps atomic state -> queues work
2. 2nd commit -> check -> commit -> swaps atomic state -> queues work
3. 1st commit work finishes
The issue with this sequence is that the same dc->current_state is
read in both atomic checks. If the first commit modifies streams or
planes those will be missing from the dc->current_state for the
second atomic check. This result in many stream and plane errors in
atomic commit tail.
[How]
The driver still needs to track old to new state to determine if the
commit in its current implementation. Updating the dc_state in
atomic tail is wrong since the dc_state swap should be happening as
part of drm_atomic_helper_swap_state *before* the worker queue kicks
its work off.
The simplest replacement for the subclassing (which doesn't properly
manage the old to new atomic state swap) is to use the drm private
object helpers. While some of the dc_state members could be merged
into dm_crtc_state or dm_plane_state and copied over that way it is
easier for now to just treat the whole dc_state structure as a single
private object.
This allows amdgpu_dm to drop the dc->current_state copy from within
atomic check. It's replaced by a copy from the current atomic state
which is propagated correctly for the sequence described above.
Since access to the dm_state private object is now locked this should
also fix issues that could arise if submitting non-blocking commits
from different threads.
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Driver shouldn't try to access any GFX registers until RLC is idle.
During the test, it took 12 seconds for RLC to clear the BUSY bit
in RLC_GPM_STAT register which is un-acceptable for driver.
As per RLC engineer, it would take RLC Ucode less than 10,000 GFXCLK
cycles to finish its critical section. In a lowest 300M enginer clock
setting(default from vbios), 50 us delay is enough.
This commit fix the hang when RLC introduce the work around for XGMI
which requires more cycles to setup more registers than normal
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
On icl+ the plane state that gets passed to update_slave() is not
the plane state of the plane we're programming. With NV12 the
plane state would be coming from the master (UV) plane whereas
the plane we're programming is the slave (Y) plane. For that reason
we need to explicitly pass around the slave plane (or we'd have to
otherwise deduce it by checking whether we were called via
.update_plane() or .update_slave()).
In the case of icl_program_input_csc_coeff() it's actually OK to
assume that we are always the master plane because the input CSC
only exists on HDR planes which can never be a slave plane. But
for consistency let's pass in the plane explicitly anyway.
While at it drop the "_coeff" from the function name since it's
kinda redundant, and this makes the name a bit shorter :)
Cc: Uma Shankar <uma.shankar@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181114210729.16185-14-ville.syrjala@linux.intel.com
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
skl+ can go belly up if there are overlapping ddb allocations between
planes. If we could absolutely guarantee that we can perform the atomic
update within a single frame we shouldn't have to worry about this. But
we can't rely on that so let's steal the ddb overlap check trick from
skl_update_crtcs() and apply it to the plane updates. Since each step
of the sequence is free from ddb overlaps we don't have to worry about
a vblank sneaking up on us in the middle of the sequence. The partial
state that gets latched by the hardware will be safe. And unlike
skl_update_crtcs() we don't have to intoduce any extra vblank waits
on account of only having to worry about a single pipe.
v2: Fix typo in commit msg (Matt)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181114210729.16185-12-ville.syrjala@linux.intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
On SKL+ the plane WM/BUF_CFG registers are a proper part of each
plane's register set. That means accessing them will cancel any
pending plane update, and we would need a PLANE_SURF register write
to arm the wm/ddb change as well.
To avoid all the problems with that let's just move the wm/ddb
programming into the plane update/disable hooks. Now all plane
registers get written in one (hopefully atomic) operation.
To make that feasible we'll move the plane ddb tracking into
the crtc state. Watermarks were already tracked there.
v2: Rebase due to input CSC
v3: Split out a bunch of junk (Matt)
v4: Add skl_wm_add_affected_planes() to deal with
cursor special case and non-zero wm register reset value
v5: Drop the unrelated for_each_intel_plane_mask() fix (Matt)
Remove the redundant ddb memset() (Matt)
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> #v3
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181127165900.31298-1-ville.syrjala@linux.intel.com
We're going to need access to the new crtc state in ->disable_plane()
for SKL+ wm/ddb programming and pre-skl pipe gamma/csc control. Pass
the crtc state down.
We'll also try to make intel_crtc_disable_planes() do the right
thing as much as it's possible. The fact that we don't have a
separate crtc state for the disabled state when we're going to
re-enable the crtc later means we might end up poking at a few
extra planes in there. But that's harmless. I suppose one might
argue that we wouldn't have to care about proper ddb/wm/csc/gamma
if the pipe is going to permanently disable anyway, but the state
checker probably cares so we should try our best to make sure
everything is programmed correctly even in that case.
v2: Fix the commit message a bit (Matt)
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181114210729.16185-5-ville.syrjala@linux.intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Some observations about the plane registers:
- the control register will self-arm if the plane is not already
enabled, thus we want to write it as close to (or ideally after)
the surface register
- tileoff/linoff/offset/aux_offset are self-arming as well so we want
them close to the surface register as well
- color keying registers we maybe self arming before SKL. Not 100%
sure but we can try to keep them near to the surface register
as well
- chv pipe b csc register are double buffered but self arming so
moving them down a bit
- the rest should be mostly armed by the surface register so we can
safely write them first, and to just for some consistency let's try
to follow keep them in order based on the register offset
None of this will have any effect of course unless the vblank evasion
fails (which it still does sometimes). Another potential future benefit
might be pulling the non-self armings registers outside the vblank
evasion since they won't latch until the arming register has been
written. This would make the critical section a bit lighter and thus
less likely to exceed the deadline.
v2: Rebase due to input CSC
v3: Swap LINOFF/TILEOFF and KEYMSK/KEYMAX to actually follow
the last rule above (Matt)
Add a bit more rationale to the commit message (Matt)
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181114210729.16185-2-ville.syrjala@linux.intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Don't bounce back to the root level for fragment processing, because
huge pages are not supported at that level. This is unlikely to happen
with the default VM size on Vega, but can be exposed by limiting the
VM size with the amdgpu.vm_size module parameter.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
According to Display Stream compression spec 1.2, the picture
parameter set metadata is sent from source to sink device
using the DP Secondary data packet. An infoframe is formed
for the PPS SDP header and PPS SDP payload bytes.
This patch adds helpers to fill the PPS SDP header
and PPS SDP payload according to the DSC 1.2 specification.
v7:
* Use BUILD_BUG_ON() to protect changing struct size (Ville)
* Remove typecaseting (Ville)
* Include byteorder.h in drm_dsc.c (Ville)
* Correct kernel doc spacing (Anusha)
v6:
* Use proper sequence points for breaking down the
assignments (Chris Wilson)
* Use SPDX identifier
v5:
Do not use bitfields for DRM structs (Jani N)
v4:
* Use DSC constants for params that dont change across
configurations
v3:
* Add reference to added kernel-docs in
Documentation/gpu/drm-kms-helpers.rst (Daniel Vetter)
v2:
* Add EXPORT_SYMBOL for the drm functions (Manasi)
Cc: dri-devel@lists.freedesktop.org
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Acked-by: Sean Paul <seanpaul@chromium.org> (For merging through
drm-intel)
Link: https://patchwork.freedesktop.org/patch/msgid/20181127214125.17658-5-manasi.d.navare@intel.com
This change is an attempt to handle the alternate clock for the CEA mode.
60Hz vs. 59.94Hz, 30Hz vs 29.97Hz or 24Hz vs 23.97Hz on the Amlogic Meson SoC
DRM Driver pixel clock generation.
The actual clock generation will be moved to the Common Clock framework once
all the video clock are handled by the Amlogic Meson SoC clock driver,
then these alternate timings will be handled in the same time in a cleaner
fashion.
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Reviewed-by: Maxime Jourdan <mjourdan@baylibre.com>
[narmstrong: fix maybe-uninitialized warnings after applying]
Link: https://patchwork.freedesktop.org/patch/msgid/1541501675-3928-1-git-send-email-narmstrong@baylibre.com
The frontend comes with two "channels", that can be configured
independently. When used in YUV mode, the first channel (CH0) represents
the luminance component while the second channel (CH1) represents the
chrominance. In RGB mode, both have to be configured the same way.
Use variables (with the YUV terminology) for each channel's
dimensions, calculating the chroma dimensions from the luma dimensions
and the sub-sampling factors from the format description.
Since the configured size only has pixel precision, the fractional
fixed-point part of the source size is dropped for both components to
ensure that the scaling factors are accurate.
Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181123092515.2511-26-paul.kocialkowski@bootlin.com
Before this patch, it is assumed that a plane is supported either
through the frontend or through the backend alone. However, the DRM
interface does not allow finely reporting our hardware capabilities
and there are cases where neither are support.
In particular, some plane formats are supported by the backend and not
the frontend, so they can only be supported without scaling.
Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181123092515.2511-8-paul.kocialkowski@bootlin.com