UAPI Changes:
- Weak parallel submission support for execlists
Minimal implementation of the parallel submission support for
execlists backend that was previously only implemented for GuC.
Support one sibling non-virtual engine.
Core Changes:
- Two backmerges of drm/drm-next for header file renames/changes and
i915_regs reorganization
Driver Changes:
- Add new DG2 subplatform: DG2-G12 (Matt R)
- Add new DG2 workarounds (Matt R, Ram, Bruce)
- Handle pre-programmed WOPCM registers for DG2+ (Daniele)
- Update guc shim control programming on XeHP SDV+ (Daniele)
- Add RPL-S C0/D0 stepping information (Anusha)
- Improve GuC ADS initialization to work on ARM64 on dGFX (Lucas)
- Fix KMD and GuC race on accessing PMU busyness (Umesh)
- Use PM timestamp instead of RING TIMESTAMP for reference in PMU with GuC (Umesh)
- Report error on invalid reset notification from GuC (John)
- Avoid WARN splat by holding RPM wakelock during PXP unbind (Juston)
- Fixes to parallel submission implementation (Matt B.)
- Improve GuC loading status check/error reports (John)
- Tweak TTM LRU priority hint selection (Matt A.)
- Align the plane_vma to min_page_size of stolen mem (Ram)
- Introduce vma resources and implement async unbinding (Thomas)
- Use struct vma_resource instead of struct vma_snapshot (Thomas)
- Return some TTM accel move errors instead of trying memcpy move (Thomas)
- Fix a race between vma / object destruction and unbinding (Thomas)
- Remove short-term pins from execbuf (Maarten)
- Update to GuC version 69.0.3 (John, Michal Wa.)
- Improvements to GT reset paths in GuC backend (Matt B.)
- Use shrinker_release_pages instead of writeback in shmem object hooks (Matt A., Tvrtko)
- Use trylock instead of blocking lock when freeing GEM objects (Maarten)
- Allocate intel_engine_coredump_alloc with ALLOW_FAIL (Matt B.)
- Fixes to object unmapping and purging (Matt A)
- Check for wedged device in GuC backend (John)
- Avoid lockdep splat by locking dpt_obj around set_cache_level (Maarten)
- Allow dead vm to unbind vma's without lock (Maarten)
- s/engine->i915/i915/ for DG2 engine workarounds (Matt R)
- Use to_gt() helper for GGTT accesses (Michal Wi.)
- Selftest improvements (Matt B., Thomas, Ram)
- Coding style and compiler warning fixes (Matt B., Jasmine, Andi, Colin, Gustavo, Dan)
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Yg4i2aCZvvee5Eai@jlahtine-mobl.ger.corp.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Fixed conflicts while applying, using the fixups/drm-intel-gt-next.patch
from drm-rerere's 1f2b1742abdd ("2022y-02m-23d-16h-07m-57s UTC: drm-tip
rerere cache update")]
In the past we had a need to differentiate TGL U and TGL Y, there
was a different voltage swing table for each subplatform and some PCI
ids of this subplatforms are shared but it turned out that it was a
specification mistake and the voltage swing table was indeed the same
but we went ahead with that patch because we needed to differentiate
TGL U and Y from TGL H and by that time TGL H was embargoed so that
was the perfect way to land it upstream.
Now the embargo for TGL H is long past and now we even have
INTEL_TGL_12_GT1_IDS with all TGL H ids, so we can drop this PCI root
check and only rely in the PCI ids to differentiate TGL U and Y from
TGL H that actually has code differences.
Besides the simplification this will fix issues in virtualization
environments where the PCI root is virtualized and don't have the same
id as actual hardware.
v2:
- add and set INTEL_SUBPLATFORM_UY
Cc: Fred Gao <fred.gao@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Tested-by: Yu He <yu.he@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220222141424.35165-1-jose.souza@intel.com
Currently we just leave the old gunk lying around in the crtc
state when userspace asks us to fully disable the crtc. That
doesn't match what the state would be had we never even enabled
the crtc in the first place. So let's make this consistent and
call intel_crtc_prepare_cleared_state() for disabled crtcs as well
(excluding bigjoiner slaves of course which have had their state
copied from the master).
I actually already did this once in commit fff13e63a1 ("drm/i915:
Clear most of crtc state when disabling the crtc") but then
commit 19f65a3dbf ("drm/i915: Try to make bigjoiner work in atomic
check") undid it all :(
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220217103221.10405-5-ville.syrjala@linux.intel.com
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
For some reason we're flagging that we need to run through the
full modeset calculations (any_ms==true -> do cdclk/etc. checks)
if any crtc got initially flagged for a modeset and is not
enabled via the uapi. No idea why this is here since later on
(after all fastset handling) we do full run through the crtcs
and flag any_ms if anything still needs a full modeset. So let's
just throw out this early weirdo.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220217103221.10405-4-ville.syrjala@linux.intel.com
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
With some VRR panels, user can turn VRR ON/OFF on the fly from the panel settings.
When VRR is turned OFF ,sends a long HPD to the driver clearing the Ignore MSA bit
in the DPCD. Currently the driver parses that onevery HPD but fails to reset
the corresponding VRR Capable Connector property.
Hence the userspace still sees this as VRR Capable panel which is incorrect.
Fix this by explicitly resetting the connector property.
v2: Reset vrr capable if status == connector_disconnected
v3: Use i915 and use bool vrr_capable (Jani Nikula)
v4: Move vrr_capable to after update modes call (Jani N)
Remove the redundant comment (Jan N)
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220215202601.22943-1-manasi.d.navare@intel.com
On contiguous allocation, we round up the size
to the *next* power of 2, implement a function
to free the unused pages after the newly allocate block.
v2(Matthew Auld):
- replace function name 'drm_buddy_free_unused_pages' with
drm_buddy_block_trim
- replace input argument name 'actual_size' with 'new_size'
- add more validation checks for input arguments
- add overlaps check to avoid needless searching and splitting
- merged the below patch to see the feature in action
- add free unused pages support to i915 driver
- lock drm_buddy_block_trim() function as it calls mark_free/mark_split
are all globally visible
v3(Matthew Auld):
- remove trim method error handling as we address the failure case
at drm_buddy_block_trim() function
v4:
- in case of trim, at __alloc_range() split_block failure path
marks the block as free and removes it from the original list,
potentially also freeing it, to overcome this problem, we turn
the drm_buddy_block_trim() input node into a temporary node to
prevent recursively freeing itself, but still retain the
un-splitting/freeing of the other nodes(Matthew Auld)
- modify the drm_buddy_block_trim() function return type
v5(Matthew Auld):
- revert drm_buddy_block_trim() function return type changes in v4
- modify drm_buddy_block_trim() passing argument n_pages to original_size
as n_pages has already been rounded up to the next power-of-two and
passing n_pages results noop
v6:
- fix warnings reported by kernel test robot <lkp@intel.com>
v7:
- modify drm_buddy_block_trim() function doc description
- at drm_buddy_block_trim() handle non-allocated block as
a serious programmer error
- fix a typo
Signed-off-by: Arunpravin <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220221164552.2434-3-Arunpravin.PaneerSelvam@amd.com
Signed-off-by: Christian König <christian.koenig@amd.com>
Implemented a function which walk through the order list,
compares the offset and returns the maximum offset block,
this method is unpredictable in obtaining the high range
address blocks which depends on allocation and deallocation.
for instance, if driver requests address at a low specific
range, allocator traverses from the root block and splits
the larger blocks until it reaches the specific block and
in the process of splitting, lower orders in the freelist
are occupied with low range address blocks and for the
subsequent TOPDOWN memory request we may return the low
range blocks.To overcome this issue, we may go with the
below approach.
The other approach, sorting each order list entries in
ascending order and compares the last entry of each
order list in the freelist and return the max block.
This creates sorting overhead on every drm_buddy_free()
request and split up of larger blocks for a single page
request.
v2:
- Fix alignment issues(Matthew Auld)
- Remove unnecessary list_empty check(Matthew Auld)
- merged the below patch to see the feature in action
- add top-down alloc support to i915 driver
Signed-off-by: Arunpravin <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220221164552.2434-2-Arunpravin.PaneerSelvam@amd.com
Signed-off-by: Christian König <christian.koenig@amd.com>
- Make drm_buddy_alloc a single function to handle
range allocation and non-range allocation demands
- Implemented a new function alloc_range() which allocates
the requested power-of-two block comply with range limitations
- Moved order computation and memory alignment logic from
i915 driver to drm buddy
v2:
merged below changes to keep the build unbroken
- drm_buddy_alloc_range() becomes obsolete and may be removed
- enable ttm range allocation (fpfn / lpfn) support in i915 driver
- apply enhanced drm_buddy_alloc() function to i915 driver
v3(Matthew Auld):
- Fix alignment issues and remove unnecessary list_empty check
- add more validation checks for input arguments
- make alloc_range() block allocations as bottom-up
- optimize order computation logic
- replace uint64_t with u64, which is preferred in the kernel
v4(Matthew Auld):
- keep drm_buddy_alloc_range() function implementation for generic
actual range allocations
- keep alloc_range() implementation for end bias allocations
v5(Matthew Auld):
- modify drm_buddy_alloc() passing argument place->lpfn to lpfn
as place->lpfn will currently always be zero for i915
v6(Matthew Auld):
- fixup potential uaf - If we are unlucky and can't allocate
enough memory when splitting blocks, where we temporarily
end up with the given block and its buddy on the respective
free list, then we need to ensure we delete both blocks,
and no just the buddy, before potentially freeing them
- fix warnings reported by kernel test robot <lkp@intel.com>
v7(Matthew Auld):
- revert fixup potential uaf
- keep __alloc_range() add node to the list logic same as
drm_buddy_alloc_blocks() by having a temporary list variable
- at drm_buddy_alloc_blocks() keep i915 range_overflows macro
and add a new check for end variable
v8:
- fix warnings reported by kernel test robot <lkp@intel.com>
v9(Matthew Auld):
- remove DRM_BUDDY_RANGE_ALLOCATION flag
- remove unnecessary function description
v10:
- keep DRM_BUDDY_RANGE_ALLOCATION flag as removing the flag
and replacing with (end < size) logic fails amdgpu driver load
Signed-off-by: Arunpravin <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220221164552.2434-1-Arunpravin.PaneerSelvam@amd.com
The VBT DSI video transfer mode field values have been defined in terms
of the VLV MIPI_VIDEO_MODE_FORMAT register. The ICL DSI code maps that
to ICL DSI_TRANS_FUNC_CONF() register. The values are the same, though
the shift is different.
Make a clean break and disassociate the values from each other. Assume
the values can be different, and translate the VBT value to VLV and ICL
register values as needed. Use the existing macros from intel_bios.h.
This will be useful in splitting the DSI register macros to files by DSI
implementation.
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220217224023.3994777-1-jani.nikula@intel.com
struct drm_display_mode embeds a list head, so overwriting
the full struct with another one will corrupt the list
(if the destination mode is on a list). Use drm_mode_copy()
instead which explicitly preserves the list head of
the destination mode.
Even if we know the destination mode is not on any list
using drm_mode_copy() seems decent as it sets a good
example. Bad examples of not using it might eventually
get copied into code where preserving the list head
actually matters.
Obviously one case not covered here is when the mode
itself is embedded in a larger structure and the whole
structure is copied. But if we are careful when copying
into modes embedded in structures I think we can be a
little more reassured that bogus list heads haven't been
propagated in.
@is_mode_copy@
@@
drm_mode_copy(...)
{
...
}
@depends on !is_mode_copy@
struct drm_display_mode *mode;
expression E, S;
@@
(
- *mode = E
+ drm_mode_copy(mode, &E)
|
- memcpy(mode, E, S)
+ drm_mode_copy(mode, E)
)
@depends on !is_mode_copy@
struct drm_display_mode mode;
expression E;
@@
(
- mode = E
+ drm_mode_copy(&mode, &E)
|
- memcpy(&mode, E, S)
+ drm_mode_copy(&mode, E)
)
@@
struct drm_display_mode *mode;
@@
- &*mode
+ mode
Cc: Emma Anholt <emma@anholt.net>
Cc: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20220218100403.7028-18-ville.syrjala@linux.intel.com
Platforms of XeHP and beyond support 3D surface (buffer) compression and
various compression formats. This is accomplished by an additional
compression control state (CCS) stored for each surface.
Gen 12 devices(TGL family and DG1) stores compression states in a separate
region of memory. It is managed by user-space and has an associated set of
user-space managed page tables used by hardware for address translation.
In Xe HP and beyond (XEHPSDV, DG2, etc), there is a new feature introduced
i.e Flat CCS. It replaced AUX page tables with a flat indexed region of
device memory for storing compression states.
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: CQ Tang <cq.tang@intel.com>
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220218184752.7524-14-ramalingam.c@intel.com
add test to check handling of misaligned offsets and sizes
v4:
* remove spurious blank lines
* explicitly cast intel_region_id to intel_memory_type in misaligned_pin
Reported-by: kernel test robot <lkp@intel.com>
v6:
* use NEEDS_COMPACT_PT instead of hard coding for DG2
v7:
* use i915_vma_unbind_unlocked in misalignment test
v8:
* handle stolen smem region returning -ENODEV due to
uninitialized on some setups
* avoid trying to test bad alignments on single page hole regions
Signed-off-by: Robert Beckett <bob.beckett@collabora.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220218184752.7524-9-ramalingam.c@intel.com
For local-memory objects we need to align the GTT addresses
to 64K, both for the ppgtt and ggtt.
We need to support vm->min_alignment > 4K, depending
on the vm itself and the type of object we are inserting.
With this in mind update the GTT selftests to take this
into account.
For compact-pt we further align and pad lmem object GTT addresses
to 2MB to ensure PDEs contain consistent page sizes as
required by the HW.
v3:
* use needs_compact_pt flag to discriminate between
64K and 64K with compact-pt
* add i915_vm_obj_min_alignment
* use i915_vm_obj_min_alignment to round up vma reservation
if compact-pt instead of hard coding
v5:
* fix i915_vm_obj_min_alignment for internal objects which
have no memory region
v6:
* tiled_blits_create correctly pick largest required alignment
v8:
* i915_vm_min_alignment protect against array overflow for mock region
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Signed-off-by: Robert Beckett <bob.beckett@collabora.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220218184752.7524-7-ramalingam.c@intel.com