In terms of async flip optimization we don't to allocate
extra ddb space, so lets skip it.
v2: - Extracted min ddb async flip check to separate function
(Ville Syrjälä)
- Used this function to prevent false positive WARN
to be triggered(Ville Syrjälä)
v3: - Renamed dg2_need_min_ddb to need_min_ddb thus making
it more universal.
- Also used DISPLAY_VER instead of IS_DG2(Ville Syrjälä)
- Use rate = 0 instead of just setting extra = 0, thus
letting other planes to use extra ddb and avoiding WARN
(Ville Syrjälä)
v4: - Renamed needs_min_ddb as s/needs/use/ to match
the wm0 counterpart(Ville Syrjälä)
- Added plane->async_flip check to use_min_ddb(now
passing plane as a parameter to do that)(Ville Syrjälä)
- Account for use_min_ddb also when calculating total data rate
(Ville Syrjälä)
v5:
- Use for_each_intel_plane_on_crtc instead of for_each_intel_plane_id
to get plane->async_flip check and account for all planes(Ville Syrjälä)
- Fix line wrapping(Ville Syrjälä)
- Set plane data rate conditionally, avoiding on redundant assignment
(Ville Syrjälä)
- Removed redundant whitespace(Ville Syrjälä)
- Handle use_min_ddb case in skl_plane_relative_data_rate instead of
icl_get_total_relative_data_rate(Ville Syrjälä)
Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220124090653.14547-2-stanislav.lisovskiy@intel.com
This optimization allows to achieve higher perfomance
during async flips.
For the first async flip we have to still temporarily
switch to sync flip, in order to reprogram plane
watermarks, so this requires taking into account
old plane state's do_async_flip flag.
v2: - Removed redundant new_plane_state->do_async_flip
check from needs_async_flip_wm_override condition
(Ville Syrjälä)
- Extract dg2_async_flip_optimization to separate
function(Ville Syrjälä)
- Check for plane->async_flip instead of plane_id
(Ville Syrjälä)
v3: - Rename "needs_async_flip_wm_override" to
"intel_plane_do_async_flip" and move all the required
checks there (Ville Syrjälä)
- Rename "dg2_async_flip_optimization" to
"use_minimal_wm0_only" (Ville Syrjälä)
v4: - Swap new/old_crtc_state in intel_plane_do_async_flip
argument list(Ville Syrjälä)
- Use plane->base.dev to grab i915 pointer in
intel_plane_do_async_flip(Ville Syrjälä)
- Remove const modifier from plane parameter in
use_minimal_wm0_only(Ville Syrjälä)
- Swap also new/old_crtc_state at intel_plane_do_async_flip
call site(Ville Syrjälä)
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220124094929.31722-1-stanislav.lisovskiy@intel.com
Sometimes we might need to change the way we calculate
watermarks, based on which particular plane it is calculated
for. Thus it would be convenient to pass plane struct to those
functions.
v2: Pass plane instead of plane_id
v3: Do not pass plane to skl_cursor_allocation(Ville Syrjälä)
v4: - Make intel_crtc_get_plane static again(Ville Syrjälä)
- s/cursor_plane/plane(Ville Syrjälä)
- Pass plane to skl_compute_wm_* instead of plane_id(Ville Syrjälä)
Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220124090653.14547-2-stanislav.lisovskiy@intel.com
Add display/intel_display_trace.[ch] for defining display
tracepoints. The main goal is to reduce cross-includes between gem and
display. It would be possible split up tracing even further, but that
would lead to more boilerplate.
We end up having to include intel_crtc.h in a few places because it was
pulled in implicitly via intel_de.h -> i915_trace.h -> intel_crtc.h, and
that's no longer the case.
There should be no changes to tracepoints.
v3:
- Rebase
v2:
- Define TRACE_INCLUDE_PATH relative to define_trace.h (Chris)
- Remove useless comments (Ville)
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/7862ad764fbd0748d903c76bc632d3d277874e5b.1638961423.git.jani.nikula@intel.com
TileF(Tile4 in bspec) format is 4K tile organized into
64B subtiles with same basic shape as for legacy TileY
which will be supported by Display13.
v2: - Fixed wrong case condition(Jani Nikula)
- Increased I915_FORMAT_MOD_F_TILED up to 12(Imre Deak)
v3: - s/I915_TILING_F/TILING_4/g
- s/I915_FORMAT_MOD_F_TILED/I915_FORMAT_MOD_4_TILED/g
- Removed unneeded fencing code
v4: - Rebased, fixed merge conflict with new table-oriented
format modifier checking(Stan)
- Replaced the rest of "Tile F" mentions to "Tile 4"(Stan)
v5: - Still had to remove some Tile F mentionings
- Moved has_4tile from adlp to DG2(Ramalingam C)
- Check specifically for DG2, but not the Display13(Imre)
v6: - Moved Tile4 associating struct for modifier/display to
the beginning(Imre Deak)
- Removed unneeded case I915_FORMAT_MOD_4_TILED modifier
checks(Imre Deak)
- Fixed I915_FORMAT_MOD_4_TILED to be 9 instead of 12
(Imre Deak)
v7: - Fixed display_ver to { 13, 13 }(Imre Deak)
- Removed redundant newline(Imre Deak)
Reviewed-by: Imre Deak <imre.deak@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Juha-Pekka Heikkilä <juha-pekka.heikkila@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211122211420.31584-1-stanislav.lisovskiy@intel.com
The FBC register defines are a mess:
- namespace changes between DPFC_, FBC_, and some platform
specific prefix at a whim
- ilk+ reuses most g4x bits but still has some separate bit
defines elsewhere
- it's not clear from the defines that the bit defines are
shared
So let's clean it up:
- both g4x and ilk register share the same defines now
- only defines which conflict have a _PLATFORM suffix, everyone
else just gets comments to indicate which platforms do what
- namespace is consistent DPFC_ now
- SNB system agent fence registers also get a consistent namespace
- REG_BIT() & co. for everything
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211104144520.22605-13-ville.syrjala@linux.intel.com
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Mika Kahola <mika.kahola@intel.com>
A new warning in clang points out a place in this file where a bitwise
OR is being used with boolean types:
drivers/gpu/drm/i915/intel_pm.c:3066:12: warning: use of bitwise '|' with boolean operands [-Wbitwise-instead-of-logical]
changed = ilk_increase_wm_latency(dev_priv, dev_priv->wm.pri_latency, 12) |
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This construct is intentional, as it allows every one of the calls to
ilk_increase_wm_latency() to occur (instead of short circuiting with
logical OR) while still caring about the result of each call.
To make this clearer to the compiler, use the '|=' operator to assign
the result of each ilk_increase_wm_latency() call to changed, which
keeps the meaning of the code the same but makes it obvious that every
one of these calls is expected to happen.
Link: https://github.com/ClangBuiltLinux/linux/issues/1473
Reported-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Suggested-by: Dávid Bolvanský <david.bolvansky@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211014211916.3550122-1-nathan@kernel.org
The memory latency values returned by pcode on DG2 are in units of "2
usec" rather than 1 usec on all other platforms. I.e., we need to
double the value returned by pcode to obtain the true latency value.
The bspec wording here was a bit ambiguous as to whether it wanted us to
multiply or divide the pcode value by two, but we confirmed offline with
the hardware team that we need to double the value the pcode gives us;
this change is intended to support a larger range of potential latency
values.
Bspec: 49326
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Harish Chegondi <harish.chegondi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210820225710.401136-1-matthew.d.roper@intel.com
UAPI Changes:
- Add I915_MMAP_OFFSET_FIXED
On devices with local memory `I915_MMAP_OFFSET_FIXED` is the only valid
type. On devices without local memory, this caching mode is invalid.
As caching mode when specifying `I915_MMAP_OFFSET_FIXED`, WC or WB will
be used, depending on the object placement on creation. WB will be used
when the object can only exist in system memory, WC otherwise.
Userspace: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11888
- Reinstate the mmap ioctl for (already released) integrated Gen12 platforms
Rationale: Otherwise media driver breaks eg. for ADL-P. Long term goal is
still to sunset the IOCTL even for integrated and require using mmap_offset.
- Reject caching/set_domain IOCTLs on discrete
Expected to become immutable property of the BO
- Disallow changing context parameters after first use on Gen12 and earlier
- Require setting context parameters at creation on platforms after Gen12
Rationale (for both): Allow less dynamic changes to the context to simplify
the implementation and avoid user shooting theirselves in the foot.
- Drop I915_CONTEXT_PARAM_RINGSIZE
Userspace PR for compute-driver has not been merged
- Drop I915_CONTEXT_PARAM_NO_ZEROMAP
Userspace PR for libdrm / Beignet was never landed
- Drop CONTEXT_CLONE API
Userspace PR for Mesa was never landed
- Drop getparam support for I915_CONTEXT_PARAM_ENGINES
Only existed for symmetry wrt. setparam, never used.
- Disallow bonding of virtual engines
Drop the prep work, no hardware has been released needing it.
- (Implicit) Disable gpu relocations
Media userspace was the last userspace to still use them. They
have converted so performance can be regained with an update.
Core Changes:
- Merge topic branch 'topic/i915-ttm-2021-06-11' (from Maarten)
- Merge topic branch 'topic/revid_steppings' (from Matt R)
- Merge topic branch 'topic/xehp-dg2-definitions-2021-07-21' (from Matt R)
- Backmerges drm-next (Rodrigo)
Driver Changes:
- Initial workarounds for ADL-P (Clint)
- Preliminary code for XeHP/DG2 (Stuart, Umesh, Matt R, Prathap, Ram,
Venkata, Akeem, Tvrtko, John, Lucas)
- Fix ADL-S DMA mask size to 39 bits (Tejas)
- Remove code for CNL (Lucas)
- Add ADL-P GuC/HuC firmwares (John)
- Update HuC to 7.9.3 for TGL/ADL-S/RKL (John)
- Fix -EDEADLK handling regression (Ville)
- Implement Wa_1508744258 for DG1 and Gen12 iGFX (Jose)
- Extend Wa_1406941453 to ADL-S (Jose)
- Drop unnecessary workarounds per stepping for SKL/BXT/ICL (Matt R)
- Use fuse info to enable SFC on Gen12 (Venkata)
- Unconditionally flush the pages on acquire on EHL/JSL (Matt A)
- Probe existence of backing struct pages upon userptr creation (Chris, Matt A)
- Add an intermediate GEM proto-context to delay real context creation (Jason)
- Implement SINGLE_TIMELINE with a syncobj (Jason)
- Set the watchdog timeout directly in intel_context_set_gem (Jason)
- Disallow userspace from creating contexts with too many engines (Jason)
- Revert "drm/i915/gem: Asynchronous cmdparser" (Jason)
- Revert "drm/i915: Propagate errors on awaiting already signaled fences" (Jason)
- Revert "drm/i915: Skip over MI_NOOP when parsing" (Jason)
- Revert "drm/i915: Shrink the GEM kmem_caches upon idling" (Daniel)
- Always let TTM handle object migration (Jason)
- Correct the locking and pin pattern for dma-buf (Thomas H, Michael R, Jason)
- Migrate to system at dma-buf attach time (Thomas, Michael R)
- MAJOR refactoring of the GuC backend code to allow for enabling on Gen11+
(Matt B, John, Michal Wa., Fernando, Daniele, Vinay)
- Update GuC firmware interface to v62.0.0 (John, Michal Wa., Matt B)
- Add GuCRC feature to hand over the control of HW RC6 to the GuC on
Gen12+ when GuC submission is enabled (Vinay, Sujaritha, Daniele,
John, Tvrtko)
- Use the correct IRQ during resume and eliminate DRM IRQ midlayer (Thomas Z)
- Add pipelined page migration and clearing (Chris, Thomas H)
- Use TTM for system memory on discrete (Thomas H)
- Implement object migration for display vs. dma-buf (Thomas H)
- Perform execbuffer object locking as a separate step (Thomas H)
- Add support for explicit L3BANK steering (Matt, Daniele)
- Remove duplicated call to ops->pread (Daniel)
- Fix pagefault disabling in the first execbuf slowpath (Daniel)
- Simplify userptr locking (Thomas H)
- Improvements to the GuC CTB code (Matt B, John)
- Make GT workaround upper bounds exclusive (Matt R)
- Check for nomodeset in i915_init() first (Daniel)
- Delete now unused gpu reloc code (Daniel)
- Document RFC plans for GuC submission, DRM scheduler and new parallel
submit uAPI (Matt B)
- Reintroduce buddy allocator this time with TTM (Matt A)
- Support forcing page size with LMEM (Matt A)
- Add i915_sched_engine to abstract a submission queue between backends (Matt B)
- Use accelerated move in TTM (Ram)
- Fix memory leaks from TTM backend (Thomas H)
- Introduce WW transaction helper (Thomas H)
- Improve debug Kconfig texts a bit (Daniel)
- Unify user object creation code (Jason)
- Use a table for i915_init/exit (Jason)
- Move slabs to module init/exit (Daniel)
- Remove now unused i915_globals (Daniel)
- Extract i915_module.c (Daniel)
- Consistently use adl-p/adl-s in WA comments (Jose)
- Finish INTEL_GEN and friends conversion (Lucas)
- Correct variable/function namings (Lucas)
- Code checker fixes (Wan, Matt A)
- Tracepoint improvements (Matt B)
- Kerneldoc improvements (Tvrtko, Jason, Matt A, Maarten)
- Selftest improvements (Chris, Matt A, Tejas, Thomas H, John, Matt B,
Rahul, Vinay)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YQ0JmYiXhGskNcrI@jlahtine-mobl.ger.corp.intel.com
DG2 extends our DDB to four DBuf slices; pipes A+B only have access to
the first two slices, whereas pipes C+D only have access to the second
two.
Confusingly, our bspec decided to switch from 1-based numbering
of dbuf slices (S1, S2) to 0-based numbering (S0, S1, S2, S3) in
Display13. At the moment we're using the 0-based number scheme for the
DBUF_CTL_S() register addressing, but the 1-based number scheme in the
actual slice assignment tables. We may want to consider switching the
assignment over to 0-based numbering too at some point...
Bspec: 49255
Bspec: 50057
Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210721223043.834562-16-matthew.d.roper@intel.com