After a TTM move or object init we need to update the i915 gem flags and
caching settings to reflect the new placement. Currently caching settings
are not changed during the lifetime of an object, although that might
change moving forward if we run into performance issues or issues with
WC system page allocations.
Also introduce gpu_binds_iomem() and cpu_maps_iomem() to clean up the
various ways we previously used to detect this.
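As a rough sketch (assuming placement is keyed purely off the TTM memory
type), the two helpers could look something like:

static bool gpu_binds_iomem(struct ttm_resource *mem)
{
        /* Anything not placed in system memory is GPU-bound through iomem. */
        return mem->mem_type != TTM_PL_SYSTEM;
}

static bool cpu_maps_iomem(struct ttm_resource *mem)
{
        /* Once / if we support GGTT, this may also be false for cached ttm_tts. */
        return mem->mem_type != TTM_PL_SYSTEM;
}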
Finally, initialize the TTM object in a reserved state so that we can
update flags and caching before anyone else gets hold of the object.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210624084240.270219-3-thomas.hellstrom@linux.intel.com
The object ops flag I915_GEM_OBJECT_HAS_IOMEM and the object
I915_BO_ALLOC_STRUCT_PAGE flag are considered immutable by
much of our code. Introduce a new mem_flags member to hold these
and make sure checks for these flags being set are either done
under the object lock or with pages properly pinned. The flags
will change during migration under the object lock.
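As a hedged sketch (the flag name is illustrative), a check could then
look like this, with the caller holding the object lock or having the
pages pinned:

static bool i915_gem_object_has_iomem(const struct drm_i915_gem_object *obj)
{
        /* Callers must hold the object lock or have the pages pinned. */
        return obj->mem_flags & I915_BO_FLAG_IOMEM;
}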
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210624084240.270219-2-thomas.hellstrom@linux.intel.com
In
commit ebc0808fa2
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Tue Oct 18 13:02:51 2016 +0100
drm/i915: Restrict pagefault disabling to just around copy_from_user()
we entirely missed that there's a slow path call to eb_relocate_entry
(or i915_gem_execbuffer_relocate_entry as it was called back then)
which was left fully wrapped by pagefault_disable/enable() calls.
Previously any issues with blocking calls were handled by the
following code:
/* we can't wait for rendering with pagefaults disabled */
if (pagefault_disabled() && !object_is_idle(obj))
return -EFAULT;
At this point the prefaulting was still around, which meant that in
normal applications it was very hard to hit this bug. No idea why the
regressions in IGTs weren't caught.
Now this all changed big time with 2 patches merged closely together.
First
commit 2889caa923
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Fri Jun 16 15:05:19 2017 +0100
drm/i915: Eliminate lots of iterations over the execobjects array
removes the prefaulting from the first relocation path, pushing it into
the first slowpath (of which this patch added a total of 3 escalation
levels). This would have really quickly uncovered the above bug, were
it not for the immediate addition of duct tape on top with
commit 7dd4f6729f
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Fri Jun 16 15:05:24 2017 +0100
drm/i915: Async GPU relocation processing
by pushing all the relocation patching to the GPU if the buffer
was busy, which avoided all the possible blocking calls.
The entire slowpath was then ditched in
commit 7dc8f11437
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Wed Mar 11 16:03:10 2020 +0000
drm/i915/gem: Drop relocation slowpath
and resurrected in
commit fd1500fcd4
Author: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Date: Wed Aug 19 16:08:43 2020 +0200
Revert "drm/i915/gem: Drop relocation slowpath".
but this did not further change the situation.
Since pagefault_disable/enable is an atomic section, any sleeping in
there is prohibited, and we definitely do that without gpu relocations
since we have to wait for the gpu usage to finish before we can patch
up the relocations.
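The intended pattern, sketched loosely below (struct reloc,
relocate_entry() and apply_reloc() are illustrative stand-ins, not the
actual i915 functions), keeps only the user copy inside the atomic
section:

struct reloc {
        u64 target_offset;
        u64 delta;
};

static int relocate_entry(struct reloc *r, const void __user *urelocs)
{
        struct reloc tmp;

        /* Only the copy itself runs with pagefaults disabled. */
        pagefault_disable();
        if (__copy_from_user_inatomic(&tmp, urelocs, sizeof(tmp))) {
                pagefault_enable();
                return -EFAULT; /* punt to the slowpath */
        }
        pagefault_enable();

        /* Sleeping is fine here, e.g. waiting for GPU use to finish. */
        return apply_reloc(r, &tmp);
}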
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20210618214503.1773805-1-daniel.vetter@ffwll.ch
A little bit of documentation covering the topics of engine discovery,
context engine maps and virtual engines. It is not very detailed, but is
supposed to be a starting point, giving a brief high-level overview of
general principles and intended use cases.
v2:
* Have the text in uapi header and link from there.
v4:
* Link from driver-uapi.rst.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210618150036.2507653-1-tvrtko.ursulin@linux.intel.com
GuC ABI documentation is now ready to be included in i915.rst
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Cc: Piotr Piórkowski <piotr.piorkowski@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210616001302.84233-4-matthew.brost@intel.com
Most of the changes to the 62.0.0 firmware revolved around the CTB
communication channel. Conform to the new (stable) CTB protocol.
v2:
(Michal)
Add values back to kernel DOC for actions
(Docs)
Add 'CT buffer' back in to fix warning
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
[mattrope: Tweaked kerneldoc while pushing as suggested by Daniele/Michal]
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210616001302.84233-3-matthew.brost@intel.com
New GuC firmware will unify the format of MMIO and CTB H2G messages.
Introduce their definitions now to allow a gradual transition of
our code to match new changes.
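As an illustration (field placement is taken loosely from the HXG
definitions and may not match the final ABI exactly), a unified header
can be built with the usual bitfield helpers:

#include <linux/bitfield.h>

#define GUC_HXG_MSG_0_ORIGIN    GENMASK(31, 31)
#define GUC_HXG_MSG_0_TYPE      GENMASK(30, 28)
#define GUC_HXG_MSG_0_AUX       GENMASK(27, 0)

static u32 guc_hxg_header(u32 origin, u32 type, u32 aux)
{
        /* Same dword-0 layout for both MMIO and CTB H2G messages. */
        return FIELD_PREP(GUC_HXG_MSG_0_ORIGIN, origin) |
               FIELD_PREP(GUC_HXG_MSG_0_TYPE, type) |
               FIELD_PREP(GUC_HXG_MSG_0_AUX, aux);
}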
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210616001302.84233-2-matthew.brost@intel.com
The submission tasklet operates on i915_sched_engine, thus it is the
correct place for it.
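Sketched roughly (see the v3 note below for the void* private data
pointer; field details are illustrative):

struct i915_sched_engine {
        /* The submission tasklet now lives with the scheduler... */
        struct tasklet_struct tasklet;
        /* ...and refers back to its backend through an opaque pointer. */
        void *private_data;
        /* ... */
};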
v3:
(Jason Ekstrand)
Change sched_engine->engine to a void* private data pointer
Add kernel doc
v4:
(Daniele)
Update private_data comment
Set queue_priority_hint in kick_execlists
v5:
(CI)
Rebase and fix build error
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-9-matthew.brost@intel.com
Rather than passing around an intel_engine_cs in the scheduling code,
pass around an i915_sched_engine.
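For illustration, resolving the sched_engine from a request might look
like this (helper name hypothetical):

static struct i915_sched_engine *
rq_sched_engine(const struct i915_request *rq)
{
        /* rq->engine may change under us, so sample it exactly once. */
        return READ_ONCE(rq->engine)->sched_engine;
}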
v3:
(Jason Ekstrand)
Add READ_ONCE around rq->engine in lock_sched_engine
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-8-matthew.brost@intel.com
Not all back-ends require a kick after a scheduling update, so make the
kick a callback function that the back-end can opt in to. Also move
the current kick function from the scheduler to the execlists file as it
is specific to that back-end.
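A minimal sketch of the opt-in, assuming a callback pointer on the
scheduler object (names follow the patch description loosely):

struct i915_sched_engine {
        /* Optional: invoked after a scheduling update; NULL if unused. */
        void (*kick_backend)(const struct i915_request *rq, int prio);
        /* ... */
};

static void sched_kick(struct i915_sched_engine *se,
                       const struct i915_request *rq, int prio)
{
        if (se->kick_backend)
                se->kick_backend(rq, prio);
}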
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-7-matthew.brost@intel.com
The schedule function should live in the scheduler object.
v3:
(Jason Ekstrand)
Add kernel doc
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-6-matthew.brost@intel.com
Move active request tracking and its lock to i915_sched_engine. This
lock is also the submission lock so having it in the i915_sched_engine
is the correct place.
v3:
(Jason Ekstrand)
Add kernel doc
v6:
Rebase
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-5-matthew.brost@intel.com
Rather than touching schedule state in the generic PM code, reset the
priolist allocation when empty in the submission code. Add a wrapper
function to do this and update the backends to call it in the correct
place.
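The wrapper could be as simple as the following sketch (exact name and
fields may differ):

static inline void
i915_sched_engine_reset_on_empty(struct i915_sched_engine *se)
{
        /* Only drop the priolist fast-path state once nothing is queued. */
        if (i915_sched_engine_is_empty(se))
                se->no_priolist = false;
}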
v3:
(Jason Ekstrand)
Update patch commit message with a better description
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-4-matthew.brost@intel.com
Introduce the i915_sched_engine object, a lower-level data structure
that i915_scheduler / generic code can operate on without touching
execlists-specific structures. This allows additional submission backends
to be added without breaking the layering. Currently the execlists
backend uses one of these objects per engine (physical or virtual), but
future backends like the GuC will point to fewer instances, making use
of the reference counting.
This is a bit of a detour on the way to integrating the i915 with the
DRM scheduler, but this object will still exist when the DRM scheduler
lands in the i915. It will, however, look a bit different: it will
encapsulate the drm_gpu_scheduler object plus any variables common to
the backends and related to scheduling. Regardless, this is a step in
the right direction.
This patch starts the aforementioned transition by moving the priolist
into the i915_sched_engine object.
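A hedged sketch of the object's starting shape (fields illustrative):

struct i915_sched_engine {
        struct kref ref;        /* shared by backends via refcounting */

        /* Protected by the backend's scheduling lock. */
        struct rb_root_cached queue;    /* the migrated priolist */
        bool no_priolist;
};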
v3:
(Jason Ekstrand)
Update comment next to intel_engine_cs.virtual
Add kernel doc
(Checkpatch)
Fix double the in commit message
v4:
(Daniele)
Update comment message.
Add comment about subclass field
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-2-matthew.brost@intel.com
When we resurrected the selftest we forgot to add back the selftest()
hook, meaning the test is not currently run.
References: d148738923 ("drm/i915/ttm Initialize the ttm device and memory managers")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210618133150.700375-1-matthew.auld@intel.com
We have assumed that if the current placement was not the requested
placement, but instead one of the busy placements, a TTM move would have
been triggered. That is not the case.
So when we initially place LMEM objects in "Limbo" (that is, system
placement without any pages allocated) to be able to defer clearing
objects until the first get_pages(), the first get_pages() would happily
keep objects in system memory if that is one of the allowed placements.
And since we don't yet support i915 GEM system memory from TTM,
everything breaks apart.
So make sure we try the requested placement first, if no eviction is
needed. If that fails, retry with all allowed placements also allowing
evictions. Also make sure we handle TTM failure codes correctly.
Also temporarily (until we support i915 GEM system on TTM), restrict
allowed placements to the requested placement to avoid things falling
apart should LMEM be full.
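A much-simplified sketch of the retry flow (the real code also controls
eviction via the busy placements, which is elided here; the helper name
is illustrative):

static int i915_ttm_place(struct ttm_buffer_object *bo,
                          struct ttm_placement *requested,
                          struct ttm_placement *allowed)
{
        struct ttm_operation_ctx ctx = { .interruptible = true };
        int ret;

        /* First try just the requested placement... */
        ret = ttm_bo_validate(bo, requested, &ctx);
        if (ret != -ENOSPC && ret != -ENOMEM)
                return ret;     /* success, or a real error */

        /* ...then retry with all allowed placements. */
        return ttm_bo_validate(bo, allowed, &ctx);
}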
Fixes: 38f28c0695 ("drm/i915/ttm: Calculate the object placement at get_pages time")
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210618132515.163277-1-thomas.hellstrom@linux.intel.com
Because Render Power Gating restricts us to just a single subslice as a
valid steering target for reads of multicast registers in a SUBSLICE
range, the default steering we set up at init may not lead to a suitable
target for L3BANK multicast registers. In cases where it does not, use
explicit runtime steering whenever an L3BANK multicast register is read.
While we're at it, let's simplify the function a little bit and drop its
support for gen10/CNL since no such platforms ever materialized for real
use. Multicast register steering is already an area that causes enough
confusion; no need to complicate it with what's effectively dead code.
v2:
- Use gt->uncore instead of gt->i915->uncore. (Tvrtko)
- Use {} as table terminator. (Rodrigo)
v3:
- L3bank fuse register is a disable mask rather than an enable mask.
We need to invert it before use. (CI)
v4:
- L3bank ID goes in the subslice field, not the slice field. (CI)
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210617211425.1943662-4-matthew.d.roper@intel.com
Although most of our multicast registers are replicated per-subslice, we
also have a small number of multicast registers that are replicated
per-l3 bank instead. For both types of multicast registers we need to
make sure we steer reads of these registers to a valid instance.
Ideally we'd like to find a specific instance ID that would steer reads
of either type of multicast register to a valid instance (i.e., not
fused off and not powered down), but sometimes the combination of
part-specific fusing and the additional restrictions imposed by Render
Power Gating make it impossible to find any overlap between the set of
valid subslices and valid l3 banks. This problem will become even more
noticeable on our upcoming platforms since they will be adding
additional types of multicast registers with new types of replication
and rules for finding valid instances for reads.
To handle this we'll continue to pick a suitable subslice instance at
driver startup and program this as the default (sliceid,subsliceid)
setting in the steering control register (0xFDC). In cases where we
need to read another type of multicast GT register, but the default
subslice steering would not correspond to a valid instance, we'll
explicitly re-steer the single read to a valid value, perform the read,
and then reset the steering to its "subslice" default.
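For illustration, an explicitly steered read could look like this sketch
(assuming 0xFDC is exposed as GEN8_MCR_SELECTOR; locking and error paths
elided):

static u32 read_steered(struct intel_uncore *uncore, i915_reg_t reg,
                        u32 steering, u32 dflt_steering)
{
        u32 val;

        /* Point 0xFDC at a known-valid instance for this register type. */
        intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, steering);
        val = intel_uncore_read_fw(uncore, reg);
        /* Restore the default "subslice" steering. */
        intel_uncore_write_fw(uncore, GEN8_MCR_SELECTOR, dflt_steering);

        return val;
}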
This patch adds the general functionality to prepare for this explicit
steering of other multicast register types. We'll plug L3 bank steering
into this in the next patch, and then add additional types of multicast
registers when the support for our next upcoming platform arrives.
v2:
- Use entry->end==0 as table terminator. (Rodrigo)
- Grab forcewake in wa_list_verify() now that we're using accessors
that assume forcewake is already held.
v3:
- Fix loop condition when iterating over steering range tables.
(Rodrigo)
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210617211425.1943662-3-matthew.d.roper@intel.com
New steering cases will be added in the follow-up patches, so prepare a
common helper to avoid code duplication.
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210617211425.1943662-2-matthew.d.roper@intel.com
To help avoid evicting already resident buffers from the batch we're
processing, perform locking as a separate step.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210615113600.30660-1-thomas.hellstrom@linux.intel.com
It's unused with the exception of the selftests. Replace a call in the
memory_region live selftest with a call into a corresponding
function in the new migrate code.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210617063018.92802-13-thomas.hellstrom@linux.intel.com
Set up a default migration context on the GT and use it from the
selftests.
Add a perf selftest and make sure we exercise LMEM if available.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210617063018.92802-10-thomas.hellstrom@linux.intel.com
Update the PTE and emit a clear within a single unpreemptible packet
such that we can schedule and pipeline clears.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210617063018.92802-9-thomas.hellstrom@linux.intel.com
If we pipeline the PTE updates and then do the copy of those pages
within a single unpreemptible command packet, we can submit the copies
and leave them to be scheduled without having to synchronously wait
under a global lock. In order to manage migration, we need to
preallocate the page tables (and keep them pinned and available for use
at any time), causing a bottleneck for migrations as all clients must
contend on the limited resources. By inlining the ppGTT updates and
performing the blit atomically, each client only owns the PTE while in
use, and so we can reschedule individual operations however we see fit.
And most importantly, we do not need to take a global lock on the shared
vm, and wait until the operation is complete before releasing the lock
for others to claim the PTE for themselves.
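The per-chunk emission could be sketched as follows (all helper names
hypothetical); the key point is that the PTE write and the blit share
one unpreemptible span:

static int emit_copy_chunk(struct i915_request *rq,
                           struct sgt_dma *src, struct sgt_dma *dst,
                           u32 size)
{
        int err;

        err = emit_pte_update(rq, src, dst);    /* inline ppGTT update */
        if (err)
                return err;

        err = emit_arb_off(rq);                 /* disable preemption */
        if (err)
                return err;

        err = emit_blit(rq, size);              /* the actual copy */
        if (err)
                return err;

        return emit_arb_on(rq);                 /* allow preemption again */
}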
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210617063018.92802-8-thomas.hellstrom@linux.intel.com
Since the ww transaction endpoint easily ends up far out of scope of
the objects on the ww object list, particularly for contending lock
objects, make sure we reference objects on the list so they don't
disappear under us.
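With this in place, a typical transaction keeps its usual shape; the ctx
now holds a reference on each locked object until fini (do_work() is a
placeholder):

static int use_obj(struct drm_i915_gem_object *obj)
{
        struct i915_gem_ww_ctx ww;
        int err;

        i915_gem_ww_ctx_init(&ww, true /* interruptible */);
retry:
        err = i915_gem_object_lock(obj, &ww);
        if (!err)
                err = do_work(obj);
        if (err == -EDEADLK) {
                err = i915_gem_ww_ctx_backoff(&ww);
                if (!err)
                        goto retry;
        }
        i915_gem_ww_ctx_fini(&ww);      /* drops the held object references */
        return err;
}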
This comes with a performance penalty so it's been debated whether this
is really needed. But I think this is motivated by the fact that locking
is typically difficult to get right, and whatever we can do to make it
simpler for developers moving forward should be done, unless the
performance impact is far too high.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210617063018.92802-2-thomas.hellstrom@linux.intel.com
intel_region_ttm_node_free is no longer used. Also fixup the related
kerneldoc.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210617083719.497619-1-matthew.auld@intel.com
We now have bo->page_alignment, which perfectly describes what we need
when we have minimum page-size restrictions for lmem. We can also drop
the flag here, since this is the default behaviour for all objects.
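For example (hedged; mem stands in for the intel_memory_region, whose
min_page_size is in bytes):

/* bo->page_alignment is expressed in page units, not bytes. */
bo->page_alignment = mem->min_page_size >> PAGE_SHIFT;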
v2(Thomas):
- bo->page_alignment is in page units
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210616152501.394518-7-matthew.auld@intel.com
Move back to the buddy allocator for managing device local memory, and
restore the lost mock selftests. Keep around the range manager related
bits, since we likely need this for managing stolen at some point. For
stolen we also don't need to reserve anything so no need to support a
generic reserve interface.
v2(Thomas):
- bo->page_alignment is in page units, not bytes
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210616152501.394518-6-matthew.auld@intel.com
Now that ttm_resource_manager just returns a generic ttm_resource we
don't need to reference the mm_node stuff anymore which mostly only
makes sense for drm_mm_node. In the next few patches we want to switch
over to the ttm_buddy_man, which is just another type of ttm_resource, so
reflect that in the naming.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210616152501.394518-5-matthew.auld@intel.com
Currently we just ignore the I915_BO_ALLOC_CONTIGUOUS flag, which is
fine since everything is already contiguous with the ttm range manager.
However in the next patch we want to switch over to the ttm buddy
manager, where allocations are by default not contiguous.
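The forwarding amounts to something like the following sketch:

/* Hedged sketch: forward the GEM flag into the TTM placement. */
if (obj->flags & I915_BO_ALLOC_CONTIGUOUS)
        place->flags |= TTM_PL_FLAG_CONTIGUOUS;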
v2(Thomas):
- Forward ALLOC_CONTIG for all regions
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210616152501.394518-4-matthew.auld@intel.com
Instead of relying on a static placement, calculate at get_pages() time.
This should work for LMEM regions and system memory for now. For stolen
we need to take the preallocated range into account. That will, if
needed, be added later.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210616152501.394518-3-matthew.auld@intel.com
We need to be able to build an sg table from our list of buddy blocks,
so that we can later plug this into our ttm backend, and replace our use
of the range manager.
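A hedged sketch of the construction, walking the buddy blocks and
coalescing physically contiguous ones (accessor names are taken loosely
from the buddy API; error handling trimmed):

static struct sg_table *sg_from_blocks(struct i915_buddy_mm *mm,
                                       struct list_head *blocks,
                                       unsigned int n_blocks, u64 base)
{
        struct i915_buddy_block *block;
        struct scatterlist *sg;
        struct sg_table *st;
        u64 prev_end = (u64)-1;

        st = kmalloc(sizeof(*st), GFP_KERNEL);
        if (!st || sg_alloc_table(st, n_blocks, GFP_KERNEL))
                return NULL;    /* error handling trimmed */

        sg = st->sgl;
        st->nents = 0;

        list_for_each_entry(block, blocks, link) {
                u64 offset = base + i915_buddy_block_offset(block);
                u64 size = i915_buddy_block_size(mm, block);

                if (offset == prev_end) {
                        /* Merge physically contiguous blocks... */
                        sg->length += size;
                } else {
                        /* ...or start a new segment. */
                        if (st->nents)
                                sg = sg_next(sg);
                        sg_dma_address(sg) = offset;
                        sg->length = size;
                        st->nents++;
                }
                prev_end = offset + size;
        }
        sg_mark_end(sg);

        return st;
}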
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210616152501.394518-2-matthew.auld@intel.com
Add back our standalone i915_buddy allocator and integrate it into a
ttm_resource_manager. This will plug into our ttm backend for managing
device local-memory in the next couple of patches.
v2(Thomas):
- Return -ENOSPC from the buddy; ttm expects this in order to
trigger eviction
- Drop the unnecessary inline
- bo->page_alignment is in page units
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210616152501.394518-1-matthew.auld@intel.com
Most of the context WAs are already implemented.
Add the adl_p platform tag to reflect this.
v2: adjust comments for clarity (MattR)
BSpec: 54369
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Aditya Swarup <aditya.swarup@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Signed-off-by: Madhumitha Tolakanahalli Pradeep <madhumitha.tolakanahalli.pradeep@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Swathi Dhanavanthri <swathi.dhanavanthri@intel.com>
Signed-off-by: Clint Taylor <clinton.a.taylor@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210608174721.17593-1-clinton.a.taylor@intel.com
Use an rwlock instead of spinlock for the global notifier lock
to reduce risk of contention in execbuf.
Protect object state with the object lock whenever possible rather
than with the global notifier lock.
Don't take an explicit page_ref in userptr_submit_init(), but rather
call get_pages() after obtaining the page list so that
get_pages() holds the page_ref. This means we don't need to call
userptr_submit_fini(); removing that call is needed to avoid awkward
locking in our upcoming VM_BIND code.
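For the first point, execbuf-side checks can then take the lock shared,
sketched roughly (field names illustrative):

/* Readers (execbuf) share the lock; invalidation takes it exclusive. */
read_lock(&i915->mm.notifier_lock);
if (mmu_interval_read_retry(&obj->userptr.notifier,
                            obj->userptr.notifier_seq))
        err = -EAGAIN;  /* pages were invalidated, retry */
read_unlock(&i915->mm.notifier_lock);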
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210610143525.624677-1-thomas.hellstrom@linux.intel.com
Merge tag 'topic/i915-ttm-2021-06-11' of git://anongit.freedesktop.org/drm/drm-misc into drm-intel-gt-next
drm-misc and drm-intel pull request for topic/i915-ttm:
- Convert i915 lmem handling to ttm.
- Add a patch to temporarily add a driver_private member to vma_node.
- Use this to allow mixed object mmap handling for i915.
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/eb71ee2d-3413-6ca8-0b7c-a58695f00b77@linux.intel.com