Commit Graph

575 Commits

Author SHA1 Message Date
Thomas Daniel
05f0addd9b drm/i915/icl: Enhanced execution list support
Enhanced Execlists is an upgraded version of execlists which supports
up to 8 ports. The lrcs to be submitted are written to a submit queue
(the ExecLists Submission Queue - ELSQ), which is then loaded on the
HW. When writing to the ELSP register, the lrcs are written cyclically
in the queue from position 0 to position 7. Alternatively, it is
possible to write directly in the individual positions of the queue
using the ELSQC registers. To be able to re-use all the existing code
we're using the latter method and we're currently limiting ourself to
only using 2 elements.

v2: Rebase.
v3: Switch from !IS_GEN11 to GEN < 11 (Daniele Ceraolo Spurio).
v4: Use the elsq registers instead of elsp. (Daniele Ceraolo Spurio)
v5: Reword commit, rename regs to be closer to specs, turn off
    preemption (Daniele), reuse engine->execlists.elsp (Chris)
v6: use has_logical_ring_elsq to differentiate the new paths
v7: add preemption support, rename els to submit_reg (Chris)
v8: save the ctrl register inside the execlists struct, drop CSB
    handling updates (superseded by preempt_complete_status) (Chris)
v9: s/drm_i915_gem_request/i915_request (Mika)
v10: resolved conflict in inject_preempt_context (Mika)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180302161501.28594-4-mika.kuoppala@linux.intel.com
2018-03-07 15:07:31 +02:00
Daniele Ceraolo Spurio
ac52da6af8 drm/i915/icl: new context descriptor support
Starting from Gen11 the context descriptor format has been updated in
the HW. The hw_id field has been considerably reduced in size and engine
class and instance fields have been added.

There is a slight name clashing issue because the field that we call
hw_id is actually called SW Context ID in the specs for Gen11+.

With the current size of the hw_id field we can have a maximum of 2k
contexts at any time, but we could use the sw_counter field (which is sw
defined) to increase that because the HW requirement is that
engine_id + sw id + sw_counter is a unique number.
GuC uses a similar method to support more contexts but does its tracking
at lrc level. To avoid doing an implementation that will need to be
reworked once GuC support lands, defer it for now and mark it as TODO.

v2: rebased, add documentation, fix GEN11_ENGINE_INSTANCE_SHIFT
v3: rebased, bring back lost code from i915_gem_context.c
v4: make TODO comment more generic
v5: be consistent with bit ordering, add extra checks (Chris)

Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Oscar Mateo <oscar.mateo@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180302161501.28594-3-mika.kuoppala@linux.intel.com
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2018-03-07 15:07:20 +02:00
Tvrtko Ursulin
d4ccceb055 drm/i915/icl: Ringbuffer interrupt handling
On Gen11 interrupt masks need to be clear to allow C6 entry.
We keep them all enabled knowing that we generate extra
interrupts.

v2: Rebase.
v3: Remove gen 11 extra check in logical_render_ring_init.
v4: Rebase fixes.
v5: Rebase/refactor.
v6: Rebase.
v7: Rebase.
v8: Update comment and commit message (Daniele)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180302161501.28594-1-mika.kuoppala@linux.intel.com
2018-03-05 16:26:28 +02:00
Chris Wilson
a3e3883646 drm/i915/execlists: Split spinlock from its irq disabling side-effect
During reset/wedging, we have to clean up the requests on the timeline
and flush the pending interrupt state. Currently, we are abusing the irq
disabling of the timeline spinlock to protect the irq state in
conjunction to the engine's timeline requests, but this is accidental
and conflates the spinlock with the irq state. A baffling state of
affairs for the reader.

Instead, explicitly disable irqs over the critical section, and separate
modifying the irq state from the timeline's requests.

Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180302143246.2579-4-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2018-03-02 23:11:12 +00:00
Chris Wilson
aebbc2d7b3 drm/i915/execlists: Move irq state manipulation inside irq disabled region
Although this state (execlists->active and engine->irq_posted) itself is
not protected by the engine->timeline spinlock, it does conveniently
ensure that irqs are disabled. We can use this to protect our
manipulation of the state and so ensure that the next IRQ to arrive sees
consistent state and (hopefully) ignores the reset engine.

Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180302131246.22036-1-chris@chris-wilson.co.uk
2018-03-02 23:11:11 +00:00
Chris Wilson
963ddd63c3 drm/i915: Suspend submission tasklets around wedging
After staring hard at sequences like

[   28.199013]  systemd-1       2..s. 26062228us : execlists_submission_tasklet: rcs0 cs-irq head=0 [0?], tail=1 [1?]
[   28.199095]  systemd-1       2..s. 26062229us : execlists_submission_tasklet: rcs0 csb[1]: status=0x00000018:0x00000000, active=0x1
[   28.199177]  systemd-1       2..s. 26062230us : execlists_submission_tasklet: rcs0 out[0]: ctx=0.1, seqno=3, prio=-1024
[   28.199258]  systemd-1       2..s. 26062231us : execlists_submission_tasklet: rcs0 completed ctx=0
[   28.199340]  gem_eio-829     1..s1 26066853us : execlists_submission_tasklet: rcs0 in[0]:  ctx=1.1, seqno=1, prio=0
[   28.199421]   <idle>-0       2..s. 26066863us : execlists_submission_tasklet: rcs0 cs-irq head=1 [1?], tail=2 [2?]
[   28.199503]   <idle>-0       2..s. 26066865us : execlists_submission_tasklet: rcs0 csb[2]: status=0x00000001:0x00000000, active=0x1
[   28.199585]  gem_eio-829     1..s1 26067077us : execlists_submission_tasklet: rcs0 in[1]:  ctx=3.1, seqno=2, prio=0
[   28.199667]  gem_eio-829     1..s1 26067078us : execlists_submission_tasklet: rcs0 in[0]:  ctx=1.2, seqno=1, prio=0
[   28.199749]   <idle>-0       2..s. 26067084us : execlists_submission_tasklet: rcs0 cs-irq head=2 [2?], tail=3 [3?]
[   28.199830]   <idle>-0       2..s. 26067085us : execlists_submission_tasklet: rcs0 csb[3]: status=0x00008002:0x00000001, active=0x1
[   28.199912]   <idle>-0       2..s. 26067086us : execlists_submission_tasklet: rcs0 out[0]: ctx=1.2, seqno=1, prio=0
[   28.199994]  gem_eio-829     2..s. 28246084us : execlists_submission_tasklet: rcs0 cs-irq head=3 [3?], tail=4 [4?]
[   28.200096]  gem_eio-829     2..s. 28246088us : execlists_submission_tasklet: rcs0 csb[4]: status=0x00000014:0x00000001, active=0x5
[   28.200178]  gem_eio-829     2..s. 28246089us : execlists_submission_tasklet: rcs0 out[0]: ctx=0.0, seqno=0, prio=0
[   28.200260]  gem_eio-829     2..s. 28246127us : execlists_submission_tasklet: execlists_submission_tasklet:886 GEM_BUG_ON(buf[2 * head + 1] != port->context_id)

the conclusion is that the only place where the ports are reset to zero,
is from engine->cancel_requests called during i915_gem_set_wedged().

The race is horrible as it results from calling set-wedged on active HW
(the GPU reset failed) and as such we need to be careful as the HW state
changes beneath us. Fortunately, it's the same scary conditions as
affect normal reset, so we can reuse the same machinery to disable state
tracking as we clobber it.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104945
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Fixes: af7a8ffad9 ("drm/i915: Use rcu instead of stop_machine in set_wedged")
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180302113324.23189-2-chris@chris-wilson.co.uk
2018-03-02 23:11:11 +00:00
Chris Wilson
f6322eddaf drm/i915/preemption: Allow preemption between submission ports
Sometimes we need to boost the priority of an in-flight request, which
may lead to the situation where the second submission port then contains
a higher priority context than the first and so we need to inject a
preemption event. To do so we must always check inside
execlists_dequeue() whether there is a priority inversion between the
ports themselves as well as the head of the priority sorted queue, and we
cannot just skip dequeuing if the queue is empty.

As Michał noted, this doesn't simply extend to handling more than 2-port
submission, as we may need to reorder within the array of executing
requests which themselves are lower priority than the first. A task for
later!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180222142229.14517-1-chris@chris-wilson.co.uk
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2018-02-23 16:37:40 +00:00
Chris Wilson
e084039b58 drm/i915/execlists: Move the GEM_BUG_ON context matches CSB later
Print out the current request/context before doing the GEM_BUG_ON, so
that we can inspect the values in the ftrace.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180221152301.9178-1-chris@chris-wilson.co.uk
2018-02-22 09:08:39 +00:00
Chris Wilson
65cb8c0f04 drm/i915/execlists: Add a GEM_TRACE to show when the context is completed
Include a GEM_TRACE to show when the context is complete and we advance
the ELSP port.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180221151553.9054-1-chris@chris-wilson.co.uk
2018-02-22 09:08:21 +00:00
Chris Wilson
561210706c drm/i915/execlists: Remove the ring advancement under preemption
Load an empty ringbuffer for preemption, ignoring the lite-restore
workaround as we know the preempt context is always idle before preemption.

Note that after some digging by Michal Winiarski, we found that
RING_HEAD is no longer being updated (due to inhibiting context save
restore) so this patch is already in effect!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180221133236.29402-1-chris@chris-wilson.co.uk
2018-02-21 20:57:47 +00:00
Chris Wilson
e61e0f51ba drm/i915: Rename drm_i915_gem_request to i915_request
We want to de-emphasize the link between the request (dependency,
execution and fence tracking) from GEM and so rename the struct from
drm_i915_gem_request to i915_request. That is we may implement the GEM
user interface on top of requests, but they are an abstraction for
tracking execution rather than an implementation detail of GEM. (Since
they are not tied to HW, we keep the i915 prefix as opposed to intel.)

In short, the spatch:
@@

@@
- struct drm_i915_gem_request
+ struct i915_request

A corollary to contracting the type name, we also harmonise on using
'rq' shorthand for local variables where space if of the essence and
repetition makes 'request' unwieldy. For globals and struct members,
'request' is still much preferred for its clarity.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180221095636.6649-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2018-02-21 20:57:22 +00:00
Chris Wilson
d081e021fd drm/i915/execlists: Remove too early assert
We can't assert that the execlists are active before we set the flag. So
perform the assert after we are expected to have marked the execlists
active.

Fixes: 339ccd35b4 ("drm/i915: Assert that we always complete a submission to guc/execlists")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Tomi Sarvela <tomi.p.sarvela@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180216153210.30551-1-chris@chris-wilson.co.uk
2018-02-16 16:18:32 +00:00
Chris Wilson
339ccd35b4 drm/i915: Assert that we always complete a submission to guc/execlists
The continual resubmission model for execlists (and emulated over guc)
requires that we keep feeding requests into the HW in order to generate
more CS interrupts to drain the rest of the queue. Add a couple of
asserts to ensure that we don't skip a cycle and come to a grinding
halt.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180215162553.23348-1-chris@chris-wilson.co.uk
2018-02-16 14:14:14 +00:00
Chris Wilson
d637637491 drm/i915: Only allocate preempt context when required
If we remove some hardcoded assumptions about the preempt context having
a fixed id, reserved from use by normal user contexts, we may only
allocate the i915_gem_context when required. Then the subsequent
decisions on using preemption reduce to having the preempt context
available.

v2: Include an assert that we don't allocate the preempt context twice.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180207210544.26351-3-chris@chris-wilson.co.uk
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
2018-02-08 07:30:16 +00:00
Chris Wilson
3fed180812 drm/i915: Move the scheduler feature bits into the purview of the engines
Rather than having the high level ioctl interface guess the underlying
implementation details, having the implementation declare what
capabilities it exports. We define an intel_driver_caps, similar to the
intel_device_info, which instead of trying to describe the HW gives
details on what the driver itself supports. This is then populated by
the engine backend for the new scheduler capability field for use
elsewhere.

v2: Use caps.scheduler for validating CONTEXT_PARAM_SET_PRIORITY (Mika)
    One less assumption of engine[RCS] \o/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tomasz Lis <tomasz.lis@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tomasz Lis <tomasz.lis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180207210544.26351-2-chris@chris-wilson.co.uk
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
2018-02-08 07:30:11 +00:00
Rafael Antognolli
4b6ce6810a drm/i915/cnl: WaPipeControlBefore3DStateSamplePattern
This workaround should prevent a bug that can be hit on a context
restore. To avoid the issue, we must emit a PIPE_CONTROL with CS stall
(0x7a000004 0x00100000 0x00000000 0x00000000) followed by 12DW's of
NOOP(0x0) in the indirect context batch buffer, to ensure the engine is
idle prior to programming 3DSTATE_SAMPLE_PATTERN.

It's also not clear whether we should add those extra dwords because of
the workaround itself, or if that's just padding for the WA BB (and next
commands could come right after the PIPE_CONTROL). We keep them for now.

References: HSD#1939868

 v2: More descriptive changelog and comments.
 v3: Explain that PIPE_CONTROL is actually 6 dwords, and that we advance
     10 more dwords because of that.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180205233330.14973-1-rafael.antognolli@intel.com
2018-02-06 08:59:39 +00:00
Chris Wilson
e840130a25 drm/i915/execlists: Move the reset bits to a more natural home
In preparation for the next patch, we want the engine to appear idle
after a reset (if there are no requests in flight). For execlists, this
entails clearing the active status on reset, it will be regenerated on
restarting the engine after the reset. In the process, note that a
couple of other status flags and checks could be moved into the
describing function.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180205152431.12163-3-chris@chris-wilson.co.uk
2018-02-05 15:27:25 +00:00
Chris Wilson
073988d102 drm/i915/execlists: Remove the startup spam
Execlists is now enabled by default and included in the list of
capabilities printed out to dmesg and beyond. We do not need to mention
it again, every time we restart the engine, so kill the spam.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180205092201.19476-6-chris@chris-wilson.co.uk
2018-02-05 13:24:14 +00:00
Chris Wilson
274de87606 drm/i915/execlists: Flush GTIIR on clearing CS interrupts during reset
Be paranoid and flush the GTIIR after clearing the CS interrupt to be
sure it has taken before we re-enable the interrupt handler. We still
see early interrupts following reset, the tasklet handling the mmio read
before it has been written by the CS. This hopefully reduces the
frequency to 0...

References: https://bugs.freedesktop.org/show_bug.cgi?id=104262
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180202145455.29876-1-chris@chris-wilson.co.uk
2018-02-02 20:31:52 +00:00
Kelvin Gardiner
225701fc20 drm/i915/icl: Set graphics mode register for gen11
This patch clears a single bit. The bit is 0 by default but expected
not to be set. Explicitly clearing the bit in this patch is intended
to indicate some thinking has occurred, and that we want this bit
cleared and we are not just excepting the default value.

We also stop setting GFX_RUN_LIST_ENABLE, which is correct since that
bit is gone.

v2 (from Paulo): fix indentation.
v3: Changed GEN check to >= 11. Corrected author name.
v4 (from Paulo): improve commit message (Daniele).

Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Kelvin Gardiner <kelvin.gardiner@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180130134918.32283-9-paulo.r.zanoni@intel.com
2018-01-31 14:29:52 -02:00
Chris Wilson
1d2a19c256 drm/i915/lrc: Remove superfluous WARN_ON
Remove the WARN_ON(ce->state) inside the static function only called
when ce->state == NULL and downgrade the w/a batch setup warning into a
developer only mode (GEM_WARN_ON).

v2: Move the deferred alloc guard into the callee, eliminating the need
for the WARN_ON:
add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-1 (-1)
Function                                     old     new   delta
execlists_context_pin                       1819    1818      -1

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180126121846.12007-1-chris@chris-wilson.co.uk
2018-01-26 13:03:07 +00:00
Chris Wilson
09b1a4e4b5 drm/i915/lrc: Clear context restore/save inhibit flags for new contexts
CTX_CONTEXT_CONTROL (CTX_SR_CTL) operates as a masked register and so
will only apply the bits that are selected by the upper half. In the
case of selectively enabling sr inhibit, this may mean the context keeps
the current setting (so forgetting to save the context later, eventually
leading to a very upset GPU!).

Fixes: 517aaffe0c ("drm/i915/execlists: Inhibit context save/restore for the fake preempt context")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180125112443.12745-1-chris@chris-wilson.co.uk
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
2018-01-25 18:04:25 +00:00
Chris Wilson
517aaffe0c drm/i915/execlists: Inhibit context save/restore for the fake preempt context
We only use the preempt context to inject an idle point into execlists.
We never need to reference its logical state, so tell the GPU never to
load it or save it.

v2: BIT(2) for save-inhibit.

N.B. Daniele mentioned this bit mbz for ICL, and has been moved into the
submission process rather than the context image.

Suggested-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180123210412.17653-1-chris@chris-wilson.co.uk
2018-01-24 09:40:15 +00:00
Michel Thierry
578f1ac689 drm/i915: Move LRC register offsets to a header file
Newer platforms may have subtle offset changes, which will increase the
number of defines, so it is probably better to start moving them to its
own header file. Also move the macros used while setting the reg state.

v2: Rename to intel_lrc_reg.h, to be consistent with i915_reg.h and
intel_guc_reg.h (Chris)
v3: License notice shenanigans.
v4: Documentation/process/coding-style.rst is always right (Chris)
v5: Rebase.

Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180124004349.22126-2-michel.thierry@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-01-24 09:22:55 +00:00
Michel Thierry
751d115302 drm/i915/lrc: Update reg_state macros to pass checkpatch
The macros we use to init the reg_state had the following issues reported
by checkpatch --strict.

  Macro argument reuse 'reg_state' - possible side-effects
  Macro argument reuse 'pos' - possible side-effects
  Macro argument reuse 'ppgtt' - possible side-effects
  spaces preferred around that '+' (ctx:VxV)

So fix these issues before they are moved to a new header file.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180124004349.22126-1-michel.thierry@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2018-01-24 09:19:38 +00:00
Chris Wilson
bb5db7e160 drm/i915/execlists: Skip forcewake for ELSP submission
Now that we can read the CSB from the HWSP, we may avoid having to
perform mmio reads entirely and so forgo the rigmarole of the forcewake
dance.

v2: Include forcewake hint for GEM_TRACE readback of mmio. If we don't
hold fw ourselves, the reads may return garbage.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180122100714.15137-1-chris@chris-wilson.co.uk
2018-01-22 18:27:04 +00:00
Tvrtko Ursulin
10bde236ef drm/i915: Per-engine scratch VMA is mandatory
We fail engine initialization if the scratch VMA cannot be created so
there is no point in error handle it later. If the initialization ordering
gets messed up, we can explode during development just as well.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180119100005.9072-2-tvrtko.ursulin@linux.intel.com
2018-01-22 17:15:31 +00:00
Tvrtko Ursulin
ae504be2e0 drm/i915: Downgrade incorrect engine constructor usage warnings to development
Render engine constructor helpers must only be called from the render
engine constructors, but there is no need to burden the production
binaries with warnings which can only be triggered during development.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180119100005.9072-1-tvrtko.ursulin@linux.intel.com
2018-01-22 17:15:20 +00:00
Chris Wilson
c218ee03b9 drm/i915: Don't adjust priority on an already signaled fence
When we retire a signaled fence, we free the dependency tree. However,
we skip clearing the list so that if we then try to adjust the priority
of the signaled fence, we may walk the list of freed dependencies.

[ 3083.156757] ==================================================================
[ 3083.156806] BUG: KASAN: use-after-free in execlists_schedule+0x199/0x660 [i915]
[ 3083.156810] Read of size 8 at addr ffff8806bf20f400 by task Xorg/831

[ 3083.156815] CPU: 0 PID: 831 Comm: Xorg Not tainted 4.15.0-rc6-no-psn+ #1
[ 3083.156817] Hardware name: Notebook                         N24_25BU/N24_25BU, BIOS 5.12 02/17/2017
[ 3083.156818] Call Trace:
[ 3083.156823]  dump_stack+0x5c/0x7a
[ 3083.156827]  print_address_description+0x6b/0x290
[ 3083.156830]  kasan_report+0x28f/0x380
[ 3083.156872]  ? execlists_schedule+0x199/0x660 [i915]
[ 3083.156914]  execlists_schedule+0x199/0x660 [i915]
[ 3083.156956]  ? intel_crtc_atomic_check+0x146/0x4e0 [i915]
[ 3083.156997]  ? execlists_submit_request+0xe0/0xe0 [i915]
[ 3083.157038]  ? i915_vma_misplaced.part.4+0x25/0xb0 [i915]
[ 3083.157079]  ? __i915_vma_do_pin+0x7c8/0xc80 [i915]
[ 3083.157121]  ? intel_atomic_state_alloc+0x44/0x60 [i915]
[ 3083.157130]  ? drm_atomic_helper_page_flip+0x3e/0xb0 [drm_kms_helper]
[ 3083.157145]  ? drm_mode_page_flip_ioctl+0x7d2/0x850 [drm]
[ 3083.157159]  ? drm_ioctl_kernel+0xa7/0xf0 [drm]
[ 3083.157172]  ? drm_ioctl+0x45b/0x560 [drm]
[ 3083.157211]  i915_gem_object_wait_priority+0x14c/0x2c0 [i915]
[ 3083.157251]  ? i915_gem_get_aperture_ioctl+0x150/0x150 [i915]
[ 3083.157290]  ? i915_vma_pin_fence+0x1d8/0x320 [i915]
[ 3083.157331]  ? intel_pin_and_fence_fb_obj+0x175/0x250 [i915]
[ 3083.157372]  ? intel_rotation_info_size+0x60/0x60 [i915]
[ 3083.157413]  ? intel_link_compute_m_n+0x80/0x80 [i915]
[ 3083.157428]  ? drm_dev_printk+0x1b0/0x1b0 [drm]
[ 3083.157443]  ? drm_dev_printk+0x1b0/0x1b0 [drm]
[ 3083.157485]  intel_prepare_plane_fb+0x2f8/0x5a0 [i915]
[ 3083.157527]  ? intel_crtc_get_vblank_counter+0x80/0x80 [i915]
[ 3083.157536]  drm_atomic_helper_prepare_planes+0xa0/0x1c0 [drm_kms_helper]
[ 3083.157587]  intel_atomic_commit+0x12e/0x4e0 [i915]
[ 3083.157605]  drm_atomic_helper_page_flip+0xa2/0xb0 [drm_kms_helper]
[ 3083.157621]  drm_mode_page_flip_ioctl+0x7d2/0x850 [drm]
[ 3083.157638]  ? drm_mode_cursor2_ioctl+0x10/0x10 [drm]
[ 3083.157652]  ? drm_lease_owner+0x1a/0x30 [drm]
[ 3083.157668]  ? drm_mode_cursor2_ioctl+0x10/0x10 [drm]
[ 3083.157681]  drm_ioctl_kernel+0xa7/0xf0 [drm]
[ 3083.157696]  drm_ioctl+0x45b/0x560 [drm]
[ 3083.157711]  ? drm_mode_cursor2_ioctl+0x10/0x10 [drm]
[ 3083.157725]  ? drm_getstats+0x20/0x20 [drm]
[ 3083.157729]  ? timerqueue_del+0x49/0x80
[ 3083.157732]  ? __remove_hrtimer+0x62/0xb0
[ 3083.157735]  ? hrtimer_try_to_cancel+0x173/0x210
[ 3083.157738]  do_vfs_ioctl+0x13b/0x880
[ 3083.157741]  ? ioctl_preallocate+0x140/0x140
[ 3083.157744]  ? _raw_spin_unlock_irq+0xe/0x30
[ 3083.157746]  ? do_setitimer+0x234/0x370
[ 3083.157750]  ? SyS_setitimer+0x19e/0x1b0
[ 3083.157752]  ? SyS_alarm+0x140/0x140
[ 3083.157755]  ? __rcu_read_unlock+0x66/0x80
[ 3083.157757]  ? __fget+0xc4/0x100
[ 3083.157760]  SyS_ioctl+0x74/0x80
[ 3083.157763]  entry_SYSCALL_64_fastpath+0x1a/0x7d
[ 3083.157765] RIP: 0033:0x7f6135d0c6a7
[ 3083.157767] RSP: 002b:00007fff01451888 EFLAGS: 00003246 ORIG_RAX: 0000000000000010
[ 3083.157769] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f6135d0c6a7
[ 3083.157771] RDX: 00007fff01451950 RSI: 00000000c01864b0 RDI: 000000000000000c
[ 3083.157772] RBP: 00007f613076f600 R08: 0000000000000001 R09: 0000000000000000
[ 3083.157773] R10: 0000000000000060 R11: 0000000000003246 R12: 0000000000000000
[ 3083.157774] R13: 0000000000000060 R14: 000000000000001b R15: 0000000000000060

[ 3083.157779] Allocated by task 831:
[ 3083.157783]  kmem_cache_alloc+0xc0/0x200
[ 3083.157822]  i915_gem_request_await_dma_fence+0x2c4/0x5d0 [i915]
[ 3083.157861]  i915_gem_request_await_object+0x321/0x370 [i915]
[ 3083.157900]  i915_gem_do_execbuffer+0x1165/0x19c0 [i915]
[ 3083.157937]  i915_gem_execbuffer2+0x1ad/0x550 [i915]
[ 3083.157950]  drm_ioctl_kernel+0xa7/0xf0 [drm]
[ 3083.157962]  drm_ioctl+0x45b/0x560 [drm]
[ 3083.157964]  do_vfs_ioctl+0x13b/0x880
[ 3083.157966]  SyS_ioctl+0x74/0x80
[ 3083.157968]  entry_SYSCALL_64_fastpath+0x1a/0x7d

[ 3083.157971] Freed by task 831:
[ 3083.157973]  kmem_cache_free+0x77/0x220
[ 3083.158012]  i915_gem_request_retire+0x72c/0xa70 [i915]
[ 3083.158051]  i915_gem_request_alloc+0x1e9/0x8b0 [i915]
[ 3083.158089]  i915_gem_do_execbuffer+0xa96/0x19c0 [i915]
[ 3083.158127]  i915_gem_execbuffer2+0x1ad/0x550 [i915]
[ 3083.158140]  drm_ioctl_kernel+0xa7/0xf0 [drm]
[ 3083.158153]  drm_ioctl+0x45b/0x560 [drm]
[ 3083.158155]  do_vfs_ioctl+0x13b/0x880
[ 3083.158156]  SyS_ioctl+0x74/0x80
[ 3083.158158]  entry_SYSCALL_64_fastpath+0x1a/0x7d

[ 3083.158162] The buggy address belongs to the object at ffff8806bf20f400
                which belongs to the cache i915_dependency of size 64
[ 3083.158166] The buggy address is located 0 bytes inside of
                64-byte region [ffff8806bf20f400, ffff8806bf20f440)
[ 3083.158168] The buggy address belongs to the page:
[ 3083.158171] page:00000000d43decc4 count:1 mapcount:0 mapping:          (null) index:0x0
[ 3083.158174] flags: 0x17ffe0000000100(slab)
[ 3083.158179] raw: 017ffe0000000100 0000000000000000 0000000000000000 0000000180200020
[ 3083.158182] raw: ffffea001afc16c0 0000000500000005 ffff880731b881c0 0000000000000000
[ 3083.158184] page dumped because: kasan: bad access detected

[ 3083.158187] Memory state around the buggy address:
[ 3083.158190]  ffff8806bf20f300: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[ 3083.158192]  ffff8806bf20f380: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[ 3083.158195] >ffff8806bf20f400: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[ 3083.158196]                    ^
[ 3083.158199]  ffff8806bf20f480: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[ 3083.158201]  ffff8806bf20f500: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
[ 3083.158203] ==================================================================

Reported-by: Alexandru Chirvasitu <achirvasub@gmail.com>
Reported-by: Mike Keehan <mike@keehan.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104436
Fixes: 1f181225f8 ("drm/i915/execlists: Keep request->priority for its lifetime")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Alexandru Chirvasitu <achirvasub@gmail.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Tested-by: Alexandru Chirvasitu <achirvasub@gmail.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180106105618.13532-1-chris@chris-wilson.co.uk
2018-01-08 09:16:02 +00:00
Chris Wilson
2221c5b7dd drm/i915/execlists: Reduce list_for_each_safe+list_safe_reset_next
After staring at the list_for_each_safe macros for a bit, our current
invocation of list_safe_reset_next in execlists_schedule() simply
reduces to list_for_each.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180102151235.3949-11-chris@chris-wilson.co.uk
2018-01-03 12:09:46 +00:00
Chris Wilson
ce01b17377 drm/i915/execlists: Assert there are no simple cycles in the dependencies
The dependency chain must be an acyclic graph. This is checked by the
swfence, but for sanity, also do a simple check that we do not corrupt
our list iteration in execlists_schedule() by a shallow dependency
cycle.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180102151235.3949-10-chris@chris-wilson.co.uk
2018-01-03 12:09:44 +00:00
Chris Wilson
83cc84c5a8 drm/i915: Assert all signalers we depended on did indeed signal
Back up our comment that all signalers should have been signaled before
we ourselves were retired with an assert to that effect.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180102151235.3949-9-chris@chris-wilson.co.uk
2018-01-03 12:05:41 +00:00
Chris Wilson
f3c9d40757 drm/i915/execlists: Tidy enabling execlists
Move the register settings for enabling execlists into its own function
for clarity.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180102151235.3949-18-chris@chris-wilson.co.uk
2018-01-03 11:02:43 +00:00
Chris Wilson
693cfbf058 drm/i915/execlists: Record elsp offset during engine setup
Currently, we record the elsp register offset inside init-hw but we only
need to do it once during engine setup (after we know the mmio
iomapping).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180102151235.3949-17-chris@chris-wilson.co.uk
2018-01-03 11:01:38 +00:00
Chris Wilson
4223221361 drm/i915/execlists: Clear context-switch interrupt earlier in the reset
Move the clearing of the CS-interrupt into the engine reset phase,
before the current init-hw phase. This helps clarify that we clear the
pending interrupts prior to any restarting of the execlists.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180102151235.3949-16-chris@chris-wilson.co.uk
2018-01-03 11:01:05 +00:00
Chris Wilson
193a98dc7c drm/i915/execlists: Show preemption progress in GEM_TRACE
We already emit a GEM_TRACE for when we start preemption, but we lack
one to show when the preemption is completed and we return to the regular
queue. This is to continue the investigation into the mysterious

<0>[  197.854177]   <idle>-0       1..s1 197837017us : execlists_submission_tasklet: rcs0 cs-irq head=0 [0], tail=0 [0]
<0>[  197.854209] drv_self-6008    2.... 197837390us : reset_common_ring: rcs0 seqno=15515
<0>[  197.854240] drv_self-6008    2.... 197837415us : reset_common_ring: bcs0 seqno=0
<0>[  197.854270] drv_self-6008    2.... 197837443us : reset_common_ring: vcs0 seqno=0
<0>[  197.854300] drv_self-6008    2.... 197837463us : reset_common_ring: vcs1 seqno=0
<0>[  197.854330] drv_self-6008    2.... 197837482us : reset_common_ring: vecs0 seqno=0
<0>[  197.854360] ksoftirq-23      2..s. 197838341us : execlists_submission_tasklet: bcs0 in[0]:  ctx=0.1, seqno=1dce7
<0>[  197.854392]   <idle>-0       1..s1 197838347us : execlists_submission_tasklet: bcs0 cs-irq head=0 [0], tail=0 [0]
<0>[  197.854423] ksoftirq-23      2..s. 197838354us : execlists_submission_tasklet: vcs0 in[0]:  ctx=0.1, seqno=1d027
<0>[  197.854456] ksoftirq-23      2.Ns. 197838361us : execlists_submission_tasklet: vcs1 in[0]:  ctx=0.1, seqno=1e738
<0>[  197.854488] ksoftirq-23      2.Ns. 197838366us : execlists_submission_tasklet: vecs0 in[0]:  ctx=0.1, seqno=235aa
<0>[  197.854520] ksoftirq-23      2.Ns. 197838376us : execlists_submission_tasklet: rcs0 in[0]:  ctx=0.1, seqno=15518
<0>[  197.854552]   <idle>-0       1..s1 197853285us : execlists_submission_tasklet: rcs0 cs-irq head=0 [0], tail=7 [7]
<0>[  197.854584]   <idle>-0       1..s1 197853285us : execlists_submission_tasklet: rcs0 csb[1]: status=0x00000018:0x00000000
<0>[  197.854616]   <idle>-0       1..s1 197853286us : execlists_submission_tasklet: rcs0 out[0]: ctx=0.0, seqno=0

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171222132742.4272-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-12-22 18:32:51 +00:00
Chris Wilson
16a8739402 drm/i915: Tidy up GEM_TRACE around execlists
Looking at the coordination of resets with the submission of execlists,
it will be useful to have a GEM_TRACE for when we issue the reset.
Whilst there tidy up the other GEM_TRACE to always include the engine
name, and be careful not to trust any pointers prior to asserts.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171220090626.31643-1-chris@chris-wilson.co.uk
2017-12-20 13:24:34 +00:00
Chris Wilson
16c8619a7c drm/i915: Avoid context dereference inside execlists_submission_tasklet
A lesson that has to be relearnt over and over again is that the request
does not keep a reference to the context and so we cannot freely
dereference the context from inside the execlists_submission_tasklet. In
particular, we try to do so in the new GEM_TRACE() so convert those over
to the port->context_id we keep for GEM debugging. This means the
tracing now depends on DRM_I915_GEM_DEBUG.

Fixes: bccd3b8311 ("drm/i915: Use trace_printk to provide a death rattle for GEM")
References: https://bugs.freedesktop.org/show_bug.cgi?id=104066
References: https://bugs.freedesktop.org/show_bug.cgi?id=104162
References: https://bugs.freedesktop.org/show_bug.cgi?id=104242
References: https://bugs.freedesktop.org/show_bug.cgi?id=104310
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171219220916.30882-1-chris@chris-wilson.co.uk
2017-12-19 23:04:45 +00:00
Chris Wilson
2fc7a06ad5 drm/i915/execlists: Cache ELSP register offset
Currently on every submission, we recalculate the ELSP register offset
for the engine, after chasing the pointers to find the iomem base. Since
this is fixed for the lifetime of the driver, record the offset in the
execlists struct.

In practice the difference is negligible, it just happens to remove 27
bytes of eyesore pointer dancing from next to the hottest instruction
(which is itself due to stalling for a cache miss) in perf profiles of
the execlists_submission_tasklet().

v2: Trim off one more elsp local.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171207222434.17686-1-chris@chris-wilson.co.uk
2017-12-08 00:37:05 +00:00
Tvrtko Ursulin
cf669b4e9f drm/i915: Consolidate checks for engine stats availability
Sagar noticed the check can be consolidated between the engine stats
implementation and the PMU.

My first choice was a static inline helper but that got into include
ordering mess quickly fast so I went with a macro instead. At some point
we should perhaps looking into taking out the non-ringubffer bits from
intel_ringbuffer.h into a new intel_engine.h or something.

v2: Use engine->flags. (Chris Wilson)
v3: Rebase and mark GuC as not yet supported. (Chris Wilson)
v4: Move flag setting to intel_engines_reset_default_submission.
    (Chris Wilson)
v5: Move flag setting to logical_ring_setup.
v6: intel_engines_reset_default_submission is the wrong place to set the
    flag - it needs to be in execlists_set_default_submission. (Sagar)
v7: Flag setting in logical_ring_setup is not required. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> (v6)
Link: https://patchwork.freedesktop.org/patch/msgid/20171129102805.22690-1-tvrtko.ursulin@linux.intel.com
2017-11-29 12:29:40 +00:00
Tvrtko Ursulin
30e17b7847 drm/i915: Engine busy time tracking
Track total time requests have been executing on the hardware.

We add new kernel API to allow software tracking of time GPU
engines are spending executing requests.

Both per-engine and global API is added with the latter also
being exported for use by external users.

v2:
 * Squashed with the internal API.
 * Dropped static key.
 * Made per-engine.
 * Store time in monotonic ktime.

v3: Moved stats clearing to disable.

v4:
 * Comments.
 * Don't export the API just yet.

v5: Whitespace cleanup.

v6:
 * Rename ref to active.
 * Drop engine aggregate stats for now.
 * Account initial busy period after enabling stats.

v7:
 * Rebase.

v8:
 * Move context in notification after the notifier. (Chris Wilson)

v9:

In cases where stats tracking is getting disabled while there is
an active context on an engine, add up the current value to the
total. This also implies we don't clear the total when tracking
is disabled any longer. There is no real need to do so because
we define the stats as relative while enabled, meaning
comparison between two samples while tracking is enabled is the
valid usage. However, when busy stats will later be plugged into
the perf PMU API, it is beneficial to not reset the total, since
the PMU core likes to do some counter disable/enable cycles on
startup, and while doing so during a single long context
executing on an engine we would lose some accuracy and so make
unit testing more difficult than needs to be.

v10:
 * Fix accounting for preemption.

v11:
 * Rebase for i915_modparams.enable_execlists removal.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-5-tvrtko.ursulin@linux.intel.com
2017-11-22 11:25:02 +00:00
Tvrtko Ursulin
73fd9d3816 drm/i915: Wrap context schedule notification
No functional change just something which will be handy in the
following patch.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-4-tvrtko.ursulin@linux.intel.com
2017-11-22 11:25:01 +00:00
Chris Wilson
fb5c551ad5 drm/i915: Remove i915.enable_execlists module parameter
Execlists and legacy ringbuffer submission are no longer feature
comparable (execlists now offer greater functionality that should
overcome their performance hit) and obsoletes the unsafe module
parameter, i.e. comparing the two modes of execution is no longer
useful, so remove the debug tool.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> #i915_perf.c
Link: https://patchwork.freedesktop.org/patch/msgid/20171120205504.21892-1-chris@chris-wilson.co.uk
2017-11-20 21:53:59 +00:00
Michel Thierry
ba74cb10c7 drm/i915/execlists: Delay writing to ELSP until HW has processed the previous write
The hardware needs some time to process the information received in the
ExecList Submission Port, and expects us to not write anything more until
it has 'acknowledged' this new submission by sending an IDLE_ACTIVE or
PREEMPTED CSB event.

If we do not follow this, the driver could write new data into the ELSP
before HW had finishing fetching the previous one, putting us in
'undefined behaviour' space.

This seems to be the problem causing the spurious PREEMPTED & COMPLETE
events after a COMPLETE like the one below:

[] vcs0: sw rd pointer = 2, hw wr pointer = 0, current 'head' = 3.
[] vcs0:  Execlist CSB[0]: 0x00000018 _ 0x00000007
[] vcs0:  Execlist CSB[1]: 0x00000001 _ 0x00000000
[] vcs0:  Execlist CSB[2]: 0x00000018 _ 0x00000007  <<< COMPLETE
[] vcs0:  Execlist CSB[3]: 0x00000012 _ 0x00000007  <<< PREEMPTED & COMPLETE
[] vcs0:  Execlist CSB[4]: 0x00008002 _ 0x00000006
[] vcs0:  Execlist CSB[5]: 0x00000014 _ 0x00000006

The ELSP writes that lead to this CSB sequence show that the HW hadn't
started executing the previous execlist (the one with only ctx 0x6) by the
time the new one was submitted; this is a bit more clear in the data
show in the EXECLIST_STATUS register at the time of the ELSP write.

[] vcs0: ELSP[0] = 0x0_0        [execlist1] - status_reg = 0x0_302
[] vcs0: ELSP[1] = 0x6_fedb2119 [execlist0] - status_reg = 0x0_8302

[] vcs0: ELSP[2] = 0x7_fedaf119 [execlist1] - status_reg = 0x0_8308
[] vcs0: ELSP[3] = 0x6_fedb2119 [execlist0] - status_reg = 0x7_8308

Note that having to wait for this ack does not disable lite-restores,
although it may reduce their numbers.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102035
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/<20171118003038.7935-1-michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171120123458.23242-4-chris@chris-wilson.co.uk
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-11-20 17:01:38 +00:00
Chris Wilson
1f5f9edb44 drm/i915/execlists: Assert that we don't get mixed IDLE_ACTIVE | COMPLETE events
If IDLE_ACTIVE is set, then all other bits are invalid. For us, we can
assert that if we see a COMPLETE | PREEMPTED event, then it should be
impossible for it to also contain an IDLE_ACTIVE flag.

Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171120123458.23242-3-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-11-20 14:50:45 +00:00
Chris Wilson
d8747afb11 drm/i915/execlists: Reduce completed event mask to COMPLETE | PREEMPTED
Since we get a COMPLETE event when the context switch occurs on
RING_HEAD == RING_TAIL and a PREEMPTED event when a switch occurs
before that point, COMPLETE | PREEMPTED should cover all possible context
switch completion events. We can move the ELEMENT_SWITCH info message
from the COMPLETED_MASK into an assertion for when we are performing a
switch to port[1].

Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171120123458.23242-2-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-11-20 14:50:03 +00:00
Chris Wilson
e40dd22624 drm/i915/execlists: Listen to COMPLETE context event not ACTIVE_IDLE
Since commit e1fee72c2e
Author: Oscar Mateo <oscar.mateo@intel.com>
Date:   Thu Jul 24 17:04:40 2014 +0100

    drm/i915/bdw: Avoid non-lite-restore preemptions

execlists has listened to (ACTIVE_IDLE | ELEMENT_SWITCH) for detecting
when one context completed and it either continued onto the next (in port
1) or idled. We would always see COMPLETE | ACTIVE_IDLE on the final
context-switch event, but on recent gen it appears that we now get
separate ACTIVE_IDLE and COMPLETE events. In particular, the ACTIVE_IDLE
events may not be coupled to a context (since it is a general state rather
than a specific context completion event).

v2: Update the history, execlists did originally start out by listening
to the COMPLETE event not ACTIVE_IDLE.
v3: Update preempt completion test to also use COMPLETE not ACTIVE_IDLE.

References: bspec/12255
References: https://bugs.freedesktop.org/show_bug.cgi?id=103800
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Acked-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171120123458.23242-1-chris@chris-wilson.co.uk
2017-11-20 14:49:16 +00:00
Sagar Arun Kamble
c6dce8f140 drm/i915: Update execlists tasklet naming
intel_lrc_irq_handler and i915_guc_irq_handler are HW submission related
tasklet functions. Name them with "submission_tasklet" suffix and
remove intel/i915 prefix as they are static. Also rename irq_tasklet
as just tasklet for clarity.

v2: s/_bh/_tasklet (Chris)

Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/1510839162-25197-2-git-send-email-sagar.a.kamble@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-11-16 15:01:31 +00:00
Chris Wilson
fd13821219 drm/i915: Make request's wait-for-space explicit
At the start of building a request, we would wait for roughly enough
space to fit the average request (to reduce the likelihood of having to
wait and abort partway through request construction). To achieve we
would try to begin a 0-length command packet, this just adds extra
confusion so make the wait-for-space explicit, as in the next patch we
want to move it from the backend to the i915_gem_request_alloc() so it
can ensure that the wait-for-space is the first operation in building a
new request.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171115151204.8105-2-chris@chris-wilson.co.uk
2017-11-15 17:12:49 +00:00
Chris Wilson
7c2fa7faf1 drm/i915: Stop caching the "golden" renderstate
As we now record the default HW state and so only emit the "golden"
renderstate once to prepare the HW, there is no advantage in keeping the
renderstate batch around as it will never be used again.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171110142634.10551-8-chris@chris-wilson.co.uk
2017-11-10 17:23:22 +00:00