Move most of the pipe+output CSC programming to the
.color_commit_noarm() hook which runs before vblank evasion.
Only PIPE_CSC_MODE (the arming register) needs to remain in
inside the critical section.
A test case that just updates the CTM in a loop produces
the following i915_update_info numbers on ilk (w/o lockdep):
old new
Updates: 10012 Updates: 10008
| |
1us |** 1us |**********
|************* |*************
4us |********* 4us |*
|* |**
16us | 16us |
| |
66us | 66us |
| |
262us | 262us |
| |
1ms | 1ms |
| |
4ms | 4ms |
| |
17ms | 17ms |
| |
Min update: 1345ns Min update: 1268ns
Max update: 16672ns Max update: 15656ns
Average update: 3914ns Average update: 2185ns
Overruns > 100us: 0 Overruns > 100us: 0
And here is tgl (forced to update both pipe CSC and
output CSC, and with lockdep enabled):
old new
Updates: 10012 Updates: 10012
| |
1us | 1us |
| |
4us |* 4us |**
|** |**********
16us |************* 16us |*************
|* |
66us | 66us |
| |
262us | 262us |
| |
1ms | 1ms |
| |
4ms | 4ms |
| |
17ms | 17ms |
| |
Min update: 5204ns Min update: 5176ns
Max update: 176038ns Max update: 186685ns
Average update: 23931ns Average update: 16654ns
Overruns > 250us: 0 Overruns > 250us: 0
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220224165103.15682-5-ville.syrjala@linux.intel.com
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
All the skl+ scaler registers are suitably confined to their own
cachelines so we don't need the uncore.lock to globally serialize
access to these registers. We actually already dropped some of this
in commit 14ad15296d ("drm/i915: Make skl+ universal plane
registers unlocked") as the plane scaler enabling/reconfiguration
became lockless. So let's complete that and remove the rest of
the locks from the scaler programming as well.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220224165103.15682-2-ville.syrjala@linux.intel.com
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
At least some DELL monitors (P2715Q) with DPCD_REV 1.2 return corrupted
DPCD register values when reading from the 0xF0000- LTTPR range with an
AUX transaction block size bigger than 1. The DP standard requires 0 to
be returned - as for any other reserved/invalid addresses - but these
monitors return the DPCD_REV register value repeated in each byte of the
read buffer. This will in turn corrupt the values returned by the LTTPRs
between the source and the monitor: LTTPRs must adjust the values they
read from the downstream DPRX, for instance right-shift/init the
downstream DP_PHY_REPEATER_CNT value. Since the value returned by the
monitor's DPRX is non-zero the adjusted values will be corrupt.
Reading the LTTPR registers one-by-one instead of reading all of them
with a single AUX transfer works around the issue.
According to the DP standard's 0xF0000 register description:
"LTTPR-related registers at DPCD Addresses F0000h through F02FFh are
valid only for DPCD r1.4 (or higher)." While it's unclear if DPCD r1.4
refers to the DPCD_REV or to the
LT_TUNABLE_PHY_REPEATER_FIELD_DATA_STRUCTURE_REV register (tickets filed
at the VESA site to clarify this haven't been addressed), one
possibility is that it's a restriction due to non-compliant monitors
described above. Disabling the non-transparent LTTPR mode for all such
monitors is not a viable solution: the transparent LTTPR mode has its
own issue causing link training failures and this would affect a lot of
monitors in use with DPCD_REV < 1.4. Instead this patch works around
the problem by reading the LTTPR common and PHY cap registers one-by-one
for any monitor with a DPCD_REV < 1.4.
The standard requires the DPCD capabilities to be read after the LTTPR
common capabilities are read, so re-read the DPCD capabilities after
the LTTPR common and PHY caps were read out.
v2:
- Use for instead of a while loop. (Ville)
- Add to code comment the monitor model with the problem.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/4531
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220322143844.42616-1-imre.deak@intel.com
Make the dbuf bandwidth min cdclk calculations match the spec
more closely. Supposedly the arbiter can only guarantee an equal
share of the total bandwidth of the slice to each active plane
on that slice. So we take the max bandwidth of any of the planes
on each slice and multiply that by the number of active planes
on the slice to get a worst case estimate on how much bandwidth
we require.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220303191207.27931-9-ville.syrjala@linux.intel.com
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
There's really no need to maintain these total[] arrays to
track the size of each plane's ddb allocation. We just stick
the results straight into the crtc_state ddb tracking structures.
The main annoyance with all this is the mismatch between
wm_uv vs. ddb_y on pre-icl. If only the hw was consistent in
what it considers the primary source of information we could
avoid some of the uglyness. But since that is not the case
we need a bit of special casing for planar formats.
v2: Keep the ddb entry zeroed when the plane is disabled
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220303191207.27931-5-ville.syrjala@linux.intel.com
Handle the plane relative data rate in exactly the same
way as we already handle the real data rate. Ie. pre-calculate
it during intel_plane_atomic_check_with_state(), and assign/clear
it for the Y plane as needed. This should guarantee that the
tracking is 100% consistent, and makes me have to think less
when the same apporach is used by both types of data rate.
We might even want to consider replacing the relative
data rate with the real data rate entirely, but it's not
clear if that will produce less optimal plane ddb
allocations. So for now lets keep using the current approach.
v2: Rebase due to async flip wm optimization
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220303191207.27931-4-ville.syrjala@linux.intel.com
Let's store the plane allocation in a manner which more closely
matches how the hw operates. That is, we store the packed/CbCr
ddb in one struct, and the Y ddb in another. Currently we're
storing packed/Y in one struct, CbCr in the other.
This also works pretty well for icl+ where the UV plane is
the main plane and the Y plane is subservient to it. Although
in this case we do not even use ddb_y as we do the ddb allocation
in terms of hw planes.
v2: Rebase
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220303191207.27931-2-ville.syrjala@linux.intel.com
For modern platforms the spec explicitly states that a
SAGV block time of zero means that SAGV is not supported.
Let's extend that to all platforms. Supposedly there should
be no systems where this isn't true, and it'll allow us to:
- use the same code regardless of older vs. newer platform
- wm latencies already treat 0 as disabled, so this fits well
with other related code
- make it a bit more clear when SAGV is used vs. not
- avoid overflows from adding U32_MAX with a u16 wm0 latency value
which could cause us to miscalculate the SAGV watermarks on tgl+
Cc: stable@vger.kernel.org
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220309164948.10671-2-ville.syrjala@linux.intel.com
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Let's just do a full DRRS disable/enable across all pipe updates.
This guarantees that the DRRS work doesn't interfere with anything
while the atomic commit is busy reprogramming the pipe.
Needed so that we can start reprogramming M/N seamlessly during
fastsets whenever possible. Also avoids the pre-bdw DRRS PIPECONF
rmw racing with the potential PIPECONF write from the atomic
commit (eg. due to GAMMA_MODE changes).
v2: Include has_drrs in state dump (José)
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220315213944.17132-1-ville.syrjala@linux.intel.com
With static DRRS the user might ask for the lowest possible refresh
rate of the panel, in which case we're not going to find a suitable
downclock mode for it and we should not try to enable seamless DRRS.
This will in fact oops.
We used to check for the presence of the downclock mode here, but
that got removed in commit f0a57798fb ("drm/i915: Introduce
intel_panel_drrs_type()") as redundant (which it was at the time).
But we do need the check again now that static DRRS is a thing.
I must have not re-tested static DRRS fully after introducing
intel_panel_drrs_type() :/
Fixes: c5ee23437c ("drm/i915: Implement static DRRS")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220315132752.11849-2-ville.syrjala@linux.intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
struct drm_display_mode embeds a list head, so overwriting
the full struct with another one will corrupt the list
(if the destination mode is on a list). Use drm_mode_copy()
instead which explicitly preserves the list head of
the destination mode.
Even if we know the destination mode is not on any list
using drm_mode_copy() seems decent as it sets a good
example. Bad examples of not using it might eventually
get copied into code where preserving the list head
actually matters.
Obviously one case not covered here is when the mode
itself is embedded in a larger structure and the whole
structure is copied. But if we are careful when copying
into modes embedded in structures I think we can be a
little more reassured that bogus list heads haven't been
propagated in.
@is_mode_copy@
@@
drm_mode_copy(...)
{
...
}
@depends on !is_mode_copy@
struct drm_display_mode *mode;
expression E, S;
@@
(
- *mode = E
+ drm_mode_copy(mode, &E)
|
- memcpy(mode, E, S)
+ drm_mode_copy(mode, E)
)
@depends on !is_mode_copy@
struct drm_display_mode mode;
expression E;
@@
(
- mode = E
+ drm_mode_copy(&mode, &E)
|
- memcpy(&mode, E, S)
+ drm_mode_copy(&mode, E)
)
@@
struct drm_display_mode *mode;
@@
- &*mode
+ mode
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220218100403.7028-20-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Commit 13ea6db2cf ("drm/i915/edp: Ignore short pulse when panel
powered off") completely broke short pulse handling for eDP as it is
usually generated by sink when it is displaying image and there is
some error or status that source needs to handle.
When power panel is enabled, this state is enough to power aux
transactions and VDD override is disabled, so intel_pps_have_power()
is always returning false causing short pulses to be ignored.
So here better naming this function that intends to check if aux
lines are powered to avoid the endless cycle mentioned in the commit
being fixed and fixing the check for what it is intended.
v2:
- renamed to intel_pps_have_panel_power_or_vdd()
- fixed indentation
Fixes: 13ea6db2cf ("drm/i915/edp: Ignore short pulse when panel powered off")
Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Uma Shankar <uma.shankar@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220311185149.110527-1-jose.souza@intel.com
Let's start supporting static DRRS by trying to match the refresh
rate the user has requested, assuming the panel supports suitable
timings.
For now we stick to just our current two timings:
- fixed_mode: the panel's preferred mode
- downclock_mode: the lowest refresh rate mode we found
Some panels may support more timings than that, but we'll
have to convert our fixed_mode/downclock_mode pointers
into a full list before we can handle that.
v2: Rebase due to intel_panel_get_modes()
Reviewed-by: Jani Nikula <jani.nikula@intel.com> #v1
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220311172428.14685-16-ville.syrjala@linux.intel.com