After staring at the breadcrumb enabling/cancellation and coming to the
conclusion that the cause of the mysterious stale breadcrumbs must the
act of submitting a completed requests, we can then redirect those
completed requests onto a dedicated signaled_list at the time of
construction and so eliminate intel_engine_transfer_stale_breadcrumbs().
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200731154834.8378-2-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Since the breadcrumb enabling/cancelling itself is serialised by the
breadcrumbs.irq_lock, with a bit of care we can remove the outer
serialisation with i915_request.lock for concurrent
dma_fence_enable_signaling(). This has the important side-effect of
eliminating the nested i915_request.lock within request submission.
The challenge in serialisation is around the unsubmission where we take
an active request that wants a breadcrumb on the signaling engine and
put it to sleep. We do not want a concurrent
dma_fence_enable_signaling() to attach a breadcrumb as we unsubmit, so
we must mark the request as no longer active before serialising with the
concurrent enable-signaling.
On retire, we serialise with the concurrent enable-signaling, but
instead of clearing ACTIVE, we mark it as SIGNALED.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200731154834.8378-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Joonas: Rebased and reordered into drm-intel-gt-next branch]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Before we can execute a request, we must wait for all of its vma to be
bound. This is a frequent operation for which we can optimise away a
few atomic operations (notably a cmpxchg) in lieu of the RCU protection.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200731085015.32368-7-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
As the conversion between idle-barrier and full i915_active_fence is
already serialised by explicit memory barriers, we can reduce the
spinlock in i915_active_acquire_preallocate_barrier() for finding an
idle-barrier to reuse to an RCU read lock to ensure the fence remains
valid, only taking the spinlock for the update of the rbtree itself.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200731085015.32368-6-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Rather than require the next timeline after idling to match the MRU
before idling, reset the index on the node and allow it to match the
first request. However, this requires cmpxchg(u64) and so is not trivial
on 32b, so for compatibility we just fallback to keeping the cached node
pointing to the MRU timeline.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200731085015.32368-5-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Whenever an i915_active idles, we prune its tree of old fence slots to
prevent a gradual leak should it be used to track many, many timelines.
The downside is that we then have to frequently reallocate the rbtree.
A compromise is that we keep the most recently used fence slot, and
reuse that for the next active reference as that is the most likely
timeline to be reused.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200731085015.32368-4-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Sometimes we have to be very careful not to allocate underneath a mutex
(or spinlock) and yet still want to track activity. Enter
i915_active_acquire_for_context(). This raises the activity counter on
i915_active prior to use and ensures that the fence-tree contains a slot
for the context.
v2: Refactor active_lookup() so it can be called again before/after
locking to resolve contention. Since we protect the rbtree until we
idle, we can do a lockfree lookup, with the caveat that if another
thread performs a concurrent insertion, the rotations from the insert
may cause us to not find our target. A second pass holding the treelock
will find the target if it exists, or the place to perform our
insertion.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200731085015.32368-3-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
If no active callback is defined for i915_active, we do not need to
serialise its enabling with the mutex. We still do only want to call the
debug activate once, and must still serialise with a concurrent retire.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200731085015.32368-2-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Since we pass around encoded parameters to the kernel context
constructor using the ce->timeline pointer, we can no longer assert that
it should be zero for mock timeline construction.
Fixes: d1bf5dd8f6 ("drm/i915/gt: Support multiple pinned timelines")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200731102206.6793-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Joonas: Updated Fixes: link after rebasing and reordering into drm-intel-gt-next branch]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
We need to ensure that the list is valid prior to marking the node as
retrievable, otherwise we may see two threads compete over the same node
in intel_gt_get_buffer_pool(). If the first thread acquires and releases
the node in the same jiffie, the second thread may then acquire it (as
the jiffie now again matches the expected value) and claim the node
before it is put back into the list.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200730134049.8822-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
We may need to allocate more than one pinned context/timeline for each
engine which can utilise the per-engine HWSP, so we need to give each
a different offset within it.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200730183906.25422-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Avoid exposing a partially constructed context by deferring the
list_add() from the initial construction to the end of registration.
Otherwise, if we peek into the list of contexts from inside debugfs, we
may see the partially constructed context and chase down some dangling
incomplete pointers.
Reported-by: CQ Tang <cq.tang@intel.com>
Fixes: 3aa9945a52 ("drm/i915: Separate GEM context construction and registration to userspace")
References: f6e8aa3871 ("drm/i915: Report the number of closed vma held by each context in debugfs")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: CQ Tang <cq.tang@intel.com>
Cc: <stable@vger.kernel.org> # v5.2+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200730092856.23615-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
A last minute change, that unfortunately broke CI so badly it declared
SUCCESS, was to refactor the debug free all buffer pool code to reuse
the normal worker, inverted the termination condition so that it instead
of discarding the nodes, they were all declared young enough and
eligible for reuse.
Fixes: 06b73c2d0b ("drm/i915/gt: Delay taking the spinlock for grabbing from the buffer pool")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200729110756.2344-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Joonas: Updating Fixes: link after rebasing and reordering into drm-intel-gt-next]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Before we peek at the barrier status for an assert, first serialise with
its callbacks so that we see a stable value.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200728153325.28351-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Some very low hanging fruit, but contention on the pool->lock is
noticeable between intel_gt_get_buffer_pool() and pool_retire(), with
the majority of the hold time due to the locked list iteration. If we
make the node itself RCU protected, we can perform the search for an
suitable node just under RCU, reserving taking the lock itself for
claiming the node and manipulating the list.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200729080245.8070-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Unlike rcs where we have conclusive evidence from our selftesting that
disabling the preparser before performing the TLB invalidate and
relocations does impact upon the GPU execution, the evidence for the
same requirement on xcs is much more circumstantial. Let's apply the
preparser disable between batches as we invalidate the TLB as a dose of
healthy paranoia, just in case.
References: https://gitlab.freedesktop.org/drm/intel/-/issues/2169
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200728152110.830-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
I915_GEM_THROTTLE dates back to the time before contexts where there was
just a single engine, and therefore a single timeline and request list
globally. That request list was in execution/retirement order, and so
walking it to find a particular aged request made sense and could be
split per file.
That is no more. We now have many timelines with a file, as many as the
user wants to construct (essentially per-engine, per-context). Each of
those run independently and so make the single list futile. Remove the
disordered list, and iterate over all the timelines to find a request to
wait on in each to satisfy the criteria that the CPU is no more than 20ms
ahead of its oldest request.
It should go without saying that the I915_GEM_THROTTLE ioctl is no
longer used as the primary means of throttling, so it makes sense to push
the complication into the ioctl where it only impacts upon its few
irregular users, rather than the execbuf/retire where everybody has to
pay the cost. Fortunately, the few users do not create vast amount of
contexts, so the loops over contexts/engines should be concise.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200728152010.30701-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
We include a tasklet flush before waiting on a request as a precaution
against the HW being lax in event signaling. We now have a precautionary
flush in the engine's heartbeat and so do not need to be quite so
zealous on every request wait. If we focus on the request, the only
tasklet flush that matters is if there is a delay in submitting this
request to HW, so if the request is not ready to be executed, no
advantage in reducing this wait can be gained by running the tasklet.
And there is little point in doing busy work for no result.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200715115147.11866-10-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Currently, we use i915_request_completed() directly in
i915_request_wait() and follow up with a manual invocation of
dma_fence_signal(). This appears to cause a large number of contentions
on i915_request.lock as when the process is woken up after the fence is
signaled by an interrupt, we will then try and call dma_fence_signal()
ourselves while the signaler is still holding the lock.
dma_fence_is_signaled() has the benefit of checking the
DMA_FENCE_FLAG_SIGNALED_BIT prior to calling dma_fence_signal() and so
avoids most of that contention.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716100754.5670-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
- Introduce a mechanism to extend execbuf2 (Lionel)
- Add syncobj timeline support (Lionel)
Driver Changes:
- Limit stolen mem usage on the compressed frame buffer (Ville)
- Some clean-up around display's cdclk (Ville)
- Some DDI changes for better DP link training according
to spec (Imre)
- Provide the perf pmu.module (Chris)
- Remove dobious Valleyview PCI IDs (Alexei)
- Add new display power saving feature for gen12+ called
HOBL (Jose)
- Move SKL's clock gating w/a to skl_init_clock_gating() (Ville)
- Rocket Lake display additions (Matt)
- Selftest: temporarily downgrade on severity of frequency
scaling tests (Chris)
- Introduce a new display workaround for fixing FLR related
issues on new PCH. (Jose)
- Temporarily disable FBC on TGL. It was the culprit of random
underruns. (Uma).
- Copy default modparams to mock i915_device (Chris)
- Add compiler paranoia for checking HWSP values (Chris)
- Remove useless gen check before calling intel_rps_boost (Chris)
- Fix a null pointer dereference (Chris)
- Add a couple of missing i915_active_fini() (Chris)
- Update TGL display power's bw_buddy table according to
update spec (Matt)
- Fix couple wrong return values (Tianjia)
- Selftest: Avoid passing random 0 into ilog2 (George)
- Many Tiger Lake display fixes and improvements for Type-C and
DP compliance (Imre, Jose)
- Start the addition of PSR2 selective fetch (Jose)
- Update a few DMC and HuC firmware versions (Jose)
- Add gen11+ w/a to fix underuns (Matt)
- Fix cmd parser desc matching with mask (Mika)
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEbSBwaO7dZQkcLOKj+mJfZA7rE8oFAl9EBoQACgkQ+mJfZA7r
E8qNIgf9E3t12hq0z+8SidMyUPXCz6+BJzed+zjF6q6w3lVaxloQbJbQc/ujec6Y
DcnHKdZWN4/BjPtO9PYsOo7JRPlw9mounMfMqhmsgCNigpy8jdE6EQB2wDY/JtWG
I/OmVwaIDWF/srRJZNJlmdx1IT6pes3A/1HBJmJWFFPFFQxl6Y8vbaZGmMDwXRzS
6/LOy7otXVGvSHqYDFzNWBPNRstUYmQuPbE4/Iei3zbS8Di3uCkspa6LbocE+T5g
cokw9fxE1cJv9bIhIY65R611XyzqqHDzM+2s3x35r8a/ectItLE7kkU07/X3RXmc
lrqf4xxzmg+lvbKaLMGdI7YRFPcvbQ==
=cGnK
-----END PGP SIGNATURE-----
Merge tag 'drm-intel-next-2020-08-24-1' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
UAPI Changes:
- Introduce a mechanism to extend execbuf2 (Lionel)
- Add syncobj timeline support (Lionel)
Driver Changes:
- Limit stolen mem usage on the compressed frame buffer (Ville)
- Some clean-up around display's cdclk (Ville)
- Some DDI changes for better DP link training according
to spec (Imre)
- Provide the perf pmu.module (Chris)
- Remove dobious Valleyview PCI IDs (Alexei)
- Add new display power saving feature for gen12+ called
HOBL (Jose)
- Move SKL's clock gating w/a to skl_init_clock_gating() (Ville)
- Rocket Lake display additions (Matt)
- Selftest: temporarily downgrade on severity of frequency
scaling tests (Chris)
- Introduce a new display workaround for fixing FLR related
issues on new PCH. (Jose)
- Temporarily disable FBC on TGL. It was the culprit of random
underruns. (Uma).
- Copy default modparams to mock i915_device (Chris)
- Add compiler paranoia for checking HWSP values (Chris)
- Remove useless gen check before calling intel_rps_boost (Chris)
- Fix a null pointer dereference (Chris)
- Add a couple of missing i915_active_fini() (Chris)
- Update TGL display power's bw_buddy table according to
update spec (Matt)
- Fix couple wrong return values (Tianjia)
- Selftest: Avoid passing random 0 into ilog2 (George)
- Many Tiger Lake display fixes and improvements for Type-C and
DP compliance (Imre, Jose)
- Start the addition of PSR2 selective fetch (Jose)
- Update a few DMC and HuC firmware versions (Jose)
- Add gen11+ w/a to fix underuns (Matt)
- Fix cmd parser desc matching with mask (Mika)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200826232733.GA129053@intel.com
UAPI Changes:
Cross-subsystem Changes:
Core Changes:
- ttm: various cleanups and reworks of the API
Driver Changes:
- ast: various cleanups
- gma500: A few fixes, conversion to GPIOd API
- hisilicon: Change of maintainer, various reworks
- ingenic: Clock handling and formats support improvements
- mcde: improvements to the DSI support
- mgag200: Support G200 desktop cards
- mxsfb: Support the i.MX7 and i.MX8M and the alpha plane
- panfrost: support devfreq
- ps8640: Retrieve the EDID from eDP control, misc improvements
- tidss: Add a workaround for AM65xx YUV formats handling
- virtio: a few cleanups, support for virtio-gpu exported resources
- bridges: Support the chained bridges on more drivers,
new bridges: Toshiba TC358762, Toshiba TC358775, Lontium LT9611
- panels: Convert to dev_ based logging, read orientation from the DT,
various fixes, new panels: Mantix MLAF057WE51-X, Chefree CH101OLHLWH-002,
Powertip PH800480T013, KingDisplay KD116N21-30NV-A010
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCX0fXGwAKCRDj7w1vZxhR
xTmMAQDPmfSsBLLNnDxu4++zFrQ7OKmNSHCkVr4nAQ/yg3GVPQEAuRw6qPwPWuV3
+jEPxaQSSmHOhx/jXfolV1tJaE/FHgA=
=WYoO
-----END PGP SIGNATURE-----
Merge tag 'drm-misc-next-2020-08-27' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for 5.10:
UAPI Changes:
Cross-subsystem Changes:
Core Changes:
- ttm: various cleanups and reworks of the API
Driver Changes:
- ast: various cleanups
- gma500: A few fixes, conversion to GPIOd API
- hisilicon: Change of maintainer, various reworks
- ingenic: Clock handling and formats support improvements
- mcde: improvements to the DSI support
- mgag200: Support G200 desktop cards
- mxsfb: Support the i.MX7 and i.MX8M and the alpha plane
- panfrost: support devfreq
- ps8640: Retrieve the EDID from eDP control, misc improvements
- tidss: Add a workaround for AM65xx YUV formats handling
- virtio: a few cleanups, support for virtio-gpu exported resources
- bridges: Support the chained bridges on more drivers,
new bridges: Toshiba TC358762, Toshiba TC358775, Lontium LT9611
- panels: Convert to dev_ based logging, read orientation from the DT,
various fixes, new panels: Mantix MLAF057WE51-X, Chefree CH101OLHLWH-002,
Powertip PH800480T013, KingDisplay KD116N21-30NV-A010
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20200827155517.do6emeacetpturli@gilmour.lan
Our variety of defined gpu commands have the actual
command id field and possibly length and flags applied.
We did start to apply the mask during initialization of
the cmd descriptors but forgot to also apply it on comparisons.
Fix comparisons in order to properly deny access with
associated commands.
v2: fix lri with correct mask (Chris)
References: 926abff21a ("drm/i915/cmdparser: Ignore Length operands during command matching")
Reported-by: Nicolai Stange <nstange@suse.de>
Cc: stable@vger.kernel.org # v5.4+
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Tyler Hicks <tyhicks@canonical.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Chris Wilson <chris.p.wilson@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200817195926.12671-1-mika.kuoppala@linux.intel.com
Add minimum width to planes, variable with specific formats for gen11+
to reflect recent bspec changes.
Signed-off-by: Matt Atwood <matthew.s.atwood@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200812210702.7153-1-matthew.s.atwood@intel.com
The dependency between power wells is determined by the ordering of the
power well list: when enabling the power wells for a domain, this
happens walking the power well list forward, while disabling them
happens in the reverse direction. Accordingly a power well on the list
must follow any other power well it depends on.
Since the TC AUX power wells depend on TC-cold being blocked, move the
TC-cold off power well before all AUX power wells.
Fixes: 3c02934b24 ("drm/i915/tc/tgl: Implement TC cold sequences")
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200720232952.16228-1-imre.deak@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit b302a2e688)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
igt_mm_config() calls ilog2() on the (pseudo)random 21-bit number
s>>12. Once in 2 million seeds, this is zero and ilog2 summons
the nasal demons.
There was an attempt to handle this case with a max(), but that's
too late; ms could already be something bizarre.
Given that the low 12 bits of s and ms are always zero, it's a lot
simpler just to divide them by 4096, then everything fits into 32
bits, and we can easily generate a random number 1 <= s <= 0x1fffff.
Fixes: 14d1b9a624 ("drm/i915: buddy allocator")
Signed-off-by: George Spelvin <lkml@sdf.org>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200325192429.GA8865@SDF.ORG
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 21118e8e56)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
In the case of calling check_digital_port_conflicts() failed, a
negative error code -EINVAL should be returned.
Fixes: bf5da83e4b ("drm/i915: Move check_digital_port_conflicts() earier")
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200802111535.5200-1-tianjia.zhang@linux.alibaba.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 66b51b801d)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
A recent bspec update removed the LPDDR4 single channel entry from the
buddy register table, but added a new four-channel entry.
Workaround 1409767108 hasn't been updated with any guidance for four
channel configurations, so we leave that alternate table unchanged for
now.
Bspec 49218
Fixes: 3fa01d642f ("drm/i915/tgl: Program BW_BUDDY registers during display init")
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200612204734.3674650-1-matthew.d.roper@intel.com
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit ecb40d0826)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Since we use the module parameters stored inside the drm_i915_device
itself, we need to ensure the mock i915_device also sets up the right
defaults.
Fixes: 8a25c4be58 ("drm/i915/params: switch to device specific parameters")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200728150600.4509-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 98ef067453)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Rather than manually implement our own module reference counting for perf
pmu events, finally realise that there is a module parameter to struct
pmu for this very purpose.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716094643.31410-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 27e897beec)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
From the 3 WAs for PSR2 man track/selective fetch this is only one
needed when doing single full frames at every flip.
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200810174144.76761-2-jose.souza@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
All GEN12 platforms supports PSR2 selective fetch but not all GEN12
platforms supports PSR2 hardware tracking(aka RKL).
This feature consists in software programming registers with the
damaged area of each plane this way hardware will only fetch from
memory those areas and sent the PSR2 selective update blocks to panel,
saving even more power.
But as initial step it is only enabling the full frame fetch at
every flip, the actual selective fetch part will come in a future
patch.
Also this is only handling the page flip side, it is still completely
missing frontbuffer modifications, that is why the
enable_psr2_sel_fetch parameter was added.
v3:
- calling intel_psr2_sel_fetch_update() during the atomic check phase
(Ville)
BSpec: 55229
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200810174144.76761-1-jose.souza@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
We usually assume that increasing PCI device revision ID's translates to
newer steppings; macros like IS_KBL_REVID() that we use rely on this
behavior. Unfortunately this turns out to not be true on KBL; the
newer device 2 revision ID's sometimes go backward to older steppings.
The situation is further complicated by different GT and display
steppings associated with each revision ID.
Let's work around this by providing a table to map the revision ID to
specific GT and display steppings, and then perform our comparisons on
the mapped values.
v2:
- Move the kbl_revids[] array to intel_workarounds.c to avoid compiler
warnings about an unused variable in files that don't call the
macros (kernel test robot).
Bspec: 18329
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200811032105.2819370-1-matthew.d.roper@intel.com
Reviewed-by: Swathi Dhanavanthri <swathi.dhanavanthri@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
This new HBR2 table for TGL-U and TGL-Y is required to pass
DisplayPort compliance.
BSpec: 49291
Cc: Khaled Almahallawy <khaled.almahallawy@intel.com>
Reviewed-by: Khaled Almahallawy<khaled.almahallawy@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200807192629.64134-2-jose.souza@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
There is no way to differentiate TGL-U from TGL-Y by the PCI ids as
some ids are available in both SKUs.
So here using the root device id in the PCI bus that iGPU is in
to differentiate between U and Y.
BSpec: 44455
Reviewed-by: Swathi Dhanavanthri <swathi.dhanavanthri@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200807192629.64134-1-jose.souza@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The command register is the PCODE MBOX low register not the high one as
described by the spec. This left the system with the TC-cold power state
being blocked all the time. Fix things by using the correct register.
Also to make sure we retry a request for at least 600usec, when the
PCODE MBOX command itself succeeded, but the TC-cold block command
failed, sleep for 1msec unconditionally after any fail.
The change was tested with JTAG register read of the HW/FW's actual
TC-cold state, which reported the expected states after this change.
Tested-by: Nivedita Swaminathan <nivedita.swaminathan@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200805150056.24248-1-imre.deak@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The dependency between power wells is determined by the ordering of the
power well list: when enabling the power wells for a domain, this
happens walking the power well list forward, while disabling them
happens in the reverse direction. Accordingly a power well on the list
must follow any other power well it depends on.
Since the TC AUX power wells depend on TC-cold being blocked, move the
TC-cold off power well before all AUX power wells.
Fixes: 3c02934b24 ("drm/i915/tc/tgl: Implement TC cold sequences")
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200720232952.16228-1-imre.deak@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
igt_mm_config() calls ilog2() on the (pseudo)random 21-bit number
s>>12. Once in 2 million seeds, this is zero and ilog2 summons
the nasal demons.
There was an attempt to handle this case with a max(), but that's
too late; ms could already be something bizarre.
Given that the low 12 bits of s and ms are always zero, it's a lot
simpler just to divide them by 4096, then everything fits into 32
bits, and we can easily generate a random number 1 <= s <= 0x1fffff.
Fixes: 14d1b9a624 ("drm/i915: buddy allocator")
Signed-off-by: George Spelvin <lkml@sdf.org>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200325192429.GA8865@SDF.ORG
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Introduces a new parameters to execbuf so that we can specify syncobj
handles as well as timeline points.
v2: Reuse i915_user_extension_fn
v3: Check that the chained extension is only present once (Chris)
v4: Check that dma_fence_chain_find_seqno returns a non NULL fence (Lionel)
v5: Use BIT_ULL (Chris)
v6: Fix issue with already signaled timeline points,
dma_fence_chain_find_seqno() setting fence to NULL (Chris)
v7: Report ENOENT with invalid syncobj handle (Lionel)
v8: Check for out of order timeline point insertion (Chris)
v9: After explanations on
https://lists.freedesktop.org/archives/dri-devel/2019-August/229287.html
drop the ordering check from v8 (Lionel)
v10: Set first extension enum item to 1 (Jason)
v11: Rebase
v12: Allow multiple extension nodes of timeline syncobj (Chris)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Co-authored-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v11)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200804085954.350343-3-lionel.g.landwerlin@intel.com
Link: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2901
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
We're planning to use this for a couple of new feature where we need
to provide additional parameters to execbuf.
v2: Check for invalid flags in execbuffer2 (Lionel)
v3: Rename I915_EXEC_EXT -> I915_EXEC_USE_EXTENSIONS (Chris)
v4: Rebase
Move array fence parsing in i915_gem_do_execbuffer()
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200804085954.350343-2-lionel.g.landwerlin@intel.com
Link: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2901
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The hardware team has dropped this workaround from the bspec; it is no
longer needed.
This reverts commit 111822b21be995a3a4a731066db3d820523c57f7.
Bspec: 49291
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200804044024.1931170-1-matthew.d.roper@intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
In the case of calling check_digital_port_conflicts() failed, a
negative error code -EINVAL should be returned.
Fixes: bf5da83e4b ("drm/i915: Move check_digital_port_conflicts() earier")
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200802111535.5200-1-tianjia.zhang@linux.alibaba.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
In function i915_active_acquire_preallocate_barrier(), not all
paths have the return value set correctly, and in case of memory
allocation failure, a negative error code should be returned.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200802115655.25568-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
A recent bspec update removed the LPDDR4 single channel entry from the
buddy register table, but added a new four-channel entry.
Workaround 1409767108 hasn't been updated with any guidance for four
channel configurations, so we leave that alternate table unchanged for
now.
Bspec 49218
Fixes: 3fa01d642f ("drm/i915/tgl: Program BW_BUDDY registers during display init")
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200612204734.3674650-1-matthew.d.roper@intel.com
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>