The module init code is somewhat misplaced in i915_pci.c, since it
needs to pull in init/exit functions from every part of the driver and
pollutes the include list a lot.
Extract an i915_module.c file which pulls all the bits together, and
allows us to massively trim the include list of i915_pci.c.
The downside is that have to drop the error path check Jason added to
catch when we set up the pci driver too early. I think that risk is
acceptable for this pretty nice include.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210727121037.2041102-11-daniel.vetter@ffwll.ch
With the global kmem_cache shrink infrastructure gone there's nothing
special and we can convert them over.
I'm doing this split up into each patch because there's quite a bit of
noise with removing the static global.slab_vmas to just a
slab_vmas.
We have to keep i915_drv.h include in i915_globals otherwise there's
nothing anymore that pulls in GEM_BUG_ON.
v2: Make slab static (Jason, 0day)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210727121037.2041102-9-daniel.vetter@ffwll.ch
Implement a simple static mapping algorithm of the i915 priority levels
(int, -1k to 1k exposed to user) to the 4 GuC levels. Mapping is as
follows:
i915 level < 0 -> GuC low level (3)
i915 level == 0 -> GuC normal level (2)
i915 level < INT_MAX -> GuC high level (1)
i915 level == INT_MAX -> GuC highest level (0)
We believe this mapping should cover the UMD use cases (3 distinct user
levels + 1 kernel level).
In addition to static mapping, a simple counter system is attached to
each context tracking the number of requests inflight on the context at
each level. This is needed as the GuC levels are per context while in
the i915 levels are per request.
v2:
(Daniele)
- Add BUILD_BUG_ON to enforce ordering of priority levels
- Add missing lockdep to guc_prio_fini
- Check for return before setting context registered flag
- Map DISPLAY priority or higher to highest guc prio
- Update comment for guc_prio
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-33-matthew.brost@intel.com
Some testing environments and some heavier tests are slower than
previous limits allowed for. For example, it can take multiple seconds
for the 'context has been reset' notification handler to reach the
'kill the requests' code in the 'active' version of the 'reset
engines' test. During which time the selftest gets bored, gives up
waiting and fails the test.
There is also an async thread that the selftest uses to pump work
through the hardware in parallel to the context that is marked for
reset. That also could get bored waiting for completions and kill the
test off.
Lastly, the flush at the of various test sections can also see
timeouts due to the large amount of work backed up. This is also true
of the live_hwsp_read test.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-32-matthew.brost@intel.com
When GuC submission is enabled, the GuC controls engine resets. Rather
than explicitly triggering a reset, the driver must submit a hanging
context to GuC and wait for the reset to occur.
Conversely, one of the tests specifically sends hanging batches to the
engines but wants them to sit around until a manual reset of the full
GT (including GuC itself). That means disabling GuC based engine
resets to prevent those from killing the hanging batch too soon. So,
add support to the scheduling policy helper for disabling resets as
well as making them quicker!
In GuC submission mode, the 'is engine idle' test basically turns into
'is engine PM wakelock held'. Independently, there is a heartbeat
disable helper function that the tests use. For unexplained reasons,
this acquires the engine wakelock before disabling the heartbeat and
only releases it when re-enabling the heartbeat. As one of the tests
tries to do a wait for idle in the middle of a heartbeat disabled
section, it is therefore guaranteed to always fail. Added a 'no_pm'
variant of the heartbeat helper that allows the engine to be asleep
while also having heartbeats disabled.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-31-matthew.brost@intel.com
This adds GuC backend support for i915_request_cancel(), which in turn
makes CONFIG_DRM_I915_REQUEST_TIMEOUT work.
This implementation makes use of fence while there are likely simplier
options. A fence was chosen because of another feature coming soon
which requires a user to block on a context until scheduling is
disabled. In that case we return the fence to the user and the user can
wait on that fence.
v2:
(Daniele)
- A comment about locking the blocked incr / decr
- A comments about the use of the fence
- Update commit message explaining why fence
- Delete redundant check blocked count in unblock function
- Ring buffer implementation
- Comment about blocked in submission path
- Shorter rpm path
v3:
(Checkpatch)
- Fix typos in commit message
(Daniel)
- Rework to simplier locking structure in guc_context_block / unblock
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-26-matthew.brost@intel.com
Changing the reset module parameter has no effect on a running GuC.
The corresponding entry in the ADS must be updated and then the GuC
informed via a Host2GuC message.
The new debugfs interface to module parameters allows this to happen.
However, connecting the parameter data address back to anything useful
is messy. One option would be to pass a new private data structure
address through instead of just the parameter pointer. However, that
means having a new (and different) data structure for each parameter
and a new (and different) write function for each parameter. This
method keeps everything generic by instead using a string lookup on
the directory entry name.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-22-matthew.brost@intel.com
In the case of a full GPU reset (e.g. because GuC has died or because
GuC's hang detection has been disabled), the driver can't rely on GuC
reporting the guilty context. Instead, the driver needs to scan all
active contexts and find one that is currently executing, as per the
execlist mode behaviour. In GuC mode, this scan is different to
execlist mode as the active request list is handled very differently.
Similarly, the request state dump in debugfs needs to be handled
differently when in GuC submission mode.
Also refactured some of the request scanning code to avoid duplication
across the multiple code paths that are now replicating it.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-20-matthew.brost@intel.com
We receive notification of an engine reset from GuC at its
completion. Meaning GuC has potentially cleared any HW state
we may have been interested in capturing. GuC resumes scheduling
on the engine post-reset, as the resets are meant to be transparent,
further muddling our error state.
There is ongoing work to define an API for a GuC debug state dump. The
suggestion for now is to manually disable FW initiated resets in cases
where debug state is needed.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-19-matthew.brost@intel.com
The new GuC interface introduces an MMIO H2G command,
INTEL_GUC_ACTION_RESET_CLIENT, which is used to implement suspend. This
MMIO tears down any active contexts generating a context reset G2H CTB
for each. Once that step completes the GuC tears down the CTB
channels. It is safe to suspend once this MMIO H2G command completes
and all G2H CTBs have been processed. In practice the i915 will likely
never receive a G2H as suspend should only be called after the GPU is
idle.
Resume is implemented in the same manner as before - simply reload the
GuC firmware and reinitialize everything (e.g. CTB channels, contexts,
etc..).
v2:
(Michel / John H)
- INTEL_GUC_ACTION_RESET_CLIENT 0x5B01 -> 0x5507
Cc: John Harrison <john.c.harrison@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-12-matthew.brost@intel.com
Reset implementation for new GuC interface. This is the legacy reset
implementation which is called when the i915 owns the engine hang check.
Future patches will offload the engine hang check to GuC but we will
continue to maintain this legacy path as a fallback and this code path
is also required if the GuC dies.
With the new GuC interface it is not possible to reset individual
engines - it is only possible to reset the GPU entirely. This patch
forces an entire chip reset if any engine hangs.
v2:
(Michal)
- Check for -EPIPE rather than -EIO (CT deadlock/corrupt check)
v3:
(John H)
- Split into a series of smaller patches
v4:
(John H)
- Fix typo
- Add braces around if statements in reset code
v5:
(Checkpatch)
- Fix warnings
Cc: John Harrison <john.c.harrison@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <john.c.harrison@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-9-matthew.brost@intel.com
Hold a reference to the intel_context over life of an i915_request.
Without this an i915_request can exist after the context has been
destroyed (e.g. request retired, context closed, but user space holds a
reference to the request from an out fence). In the case of GuC
submission + virtual engine, the engine that the request references is
also destroyed which can trigger bad pointer dref in fence ops (e.g.
i915_fence_get_driver_name). We could likely change
i915_fence_get_driver_name to avoid touching the engine but let's just
be safe and hold the intel_context reference.
v2:
(John Harrison)
- Update comment explaining how GuC mode and execlists mode deal with
virtual engines differently
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-4-matthew.brost@intel.com
The serial number tracking of engines happens at the backend of
request submission and was expecting to only be given physical
engines. However, in GuC submission mode, the decomposition of virtual
to physical engines does not happen in i915. Instead, requests are
submitted to their virtual engine mask all the way through to the
hardware (i.e. to GuC). This would mean that the heart beat code
thinks the physical engines are idle due to the serial number not
incrementing. Which in turns means hangcheck does not work for
GuC virtual engines.
This patch updates the tracking to decompose virtual engines into
their physical constituents and tracks the request against each. This
is not entirely accurate as the GuC will only be issuing the request
to one physical engine. However, it is the best that i915 can do given
that it has no knowledge of the GuC's scheduling decisions.
Downside of this is that all physical engines constituting a GuC
virtual engine will be periodically unparked (even during just a single
context executing) in order to be pinged with a heartbeat request.
However the power and performance cost of this is not expected to be
measurable (due low frequency of heartbeat pulses) and it is considered
an easier option than trying to make changes to GuC firmware.
v2:
(Tvrtko)
- Update commit message
- Have default behavior if no vfunc present
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-3-matthew.brost@intel.com
EHL and JSL add the 'Bypass LLC' MOCS entry, which should make it
possible for userspace to bypass the GTT caching bits set by the kernel,
as per the given object cache_level. This is troublesome since the heavy
flush we apply when first acquiring the pages is skipped if the kernel
thinks the object is coherent with the GPU. As a result it might be
possible to bypass the cache and read the contents of the page directly,
which could be stale data. If it's just a case of userspace shooting
themselves in the foot then so be it, but since i915 takes the stance of
always zeroing memory before handing it to userspace, we need to prevent
this.
v2: this time actually set cache_dirty in put_pages()
v3: move to get_pages() which looks simpler
BSpec: 34007
References: 046091758b ("Revert "drm/i915/ehl: Update MOCS table for EHL"")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Cc: Francisco Jerez <francisco.jerez.plata@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Chris Wilson <chris.p.wilson@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210723105045.400841-2-matthew.auld@intel.com
Until we support p2p dma or as a complement to that, migrate data
to system memory at dma-buf attach time if possible.
v2:
- Rebase on dynamic exporter. Update the igt_dmabuf_import_same_driver
selftest to migrate if we are LMEM capable.
v3:
- Migrate also in the pin() callback.
v4:
- Migrate in attach
v5: (jason)
- Lock around the migration
v6: (jason)
- Move the can_migrate check outside the lock
- Rework the selftests to test more migration conditions. In
particular, SMEM, LMEM, and LMEM+SMEM are all checked.
v7: (mauld)
- Misc style nits
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210723172142.3273510-9-jason@jlekstrand.net
If our exported dma-bufs are imported by another instance of our driver,
that instance will typically have the imported dma-bufs locked during
dma_buf_map_attachment(). But the exporter also locks the same reservation
object in the map_dma_buf() callback, which leads to recursive locking.
So taking the lock inside _pin_pages_unlocked() is incorrect.
Additionally, the current pinning code path is contrary to the defined
way that pinning should occur.
Remove the explicit pin/unpin from the map/umap functions and move them
to the attach/detach allowing correct locking to occur, and to match
the static dma-buf drm_prime pattern.
Add a live selftest to exercise both dynamic and non-dynamic
exports.
v2:
- Extend the selftest with a fake dynamic importer.
- Provide real pin and unpin callbacks to not abuse the interface.
v3: (ruhl)
- Remove the dynamic export support and move the pinning into the
attach/detach path.
v4: (ruhl)
- Put pages does not need to assert on the dma-resv
v5: (jason)
- Lock around dma_buf_unmap_attachment() when emulating a dynamic
importer in the subtests.
- Use pin_pages_unlocked
v6: (jason)
- Use dma_buf_attach instead of dma_buf_attach_dynamic in the selftests
v7: (mauld)
- Use __i915_gem_object_get_pages (2 __underscores) instead of the
4 ____underscore version in the selftests
v8: (mauld)
- Drop the kernel doc from the static i915_gem_dmabuf_attach function
- Add missing "err = PTR_ERR()" to a bunch of selftest error cases
Reported-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210723172142.3273510-8-jason@jlekstrand.net
Without TTM, we have no such hook so we exit early but this is fine
because we use TTM on all LMEM platforms and, on integrated platforms,
there is no real migration. If we do have the hook, it's better to just
let TTM handle the migration because it knows where things are actually
placed.
This fixes a bug where i915_gem_object_migrate fails to migrate newly
created LMEM objects. In that scenario, the object has obj->mm.region
set to LMEM but TTM has it in SMEM because that's where all new objects
are placed there prior to getting actual pages. When we invoke
i915_gem_object_migrate, it exits early because, from the point of view
of the GEM object, it's already in LMEM and no migration is needed.
Then, when we try to pin the pages, __i915_ttm_get_pages is called
which, unaware of our failed attempt at a migration, places the object
in SMEM. This only happens on newly created objects because they have
this weird state where TTM thinks they're in SMEM, GEM thinks they're in
LMEM, and the reality is that they don't exist at all.
It's better if GEM just always calls into TTM and let's TTM handle
things. That way the lies stay better contained. Once the migration is
complete, the object will have pages, obj->mm.region will be correct,
and we're done lying.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210723172142.3273510-7-jason@jlekstrand.net