Commit Graph

190 Commits

Author SHA1 Message Date
Chris Wilson
e8a70cab25 drm/i915: Use snprintf to avoid line-break when pretty-printing engines
When printing the execlist ports, we first print the ELSP header then
follow it with the pretty-printed request. Since switching to
drm_printer and show the output via printk, it automatically appends a
newline to each call (unlike the old seq_printf output). To avoid the
unwanted line break, construct the ELSP request header in a temporary
buffer.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171208012303.25504-1-chris@chris-wilson.co.uk
2017-12-08 18:48:34 +00:00
Valtteri Rantala
7436830c8d drm/i915/glk: Apply WaProgramL3SqcReg1DefaultForPerf for GLK too
Testing the texture read performance shows that the same tuning for
the SQ credits is needed on GLK as on BXT/APL. This has been also
confirmed by Altug from the HW team.

V4: Rebase + fix
Signed-off-by: Valtteri Rantala <valtteri.rantala@intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com> (v1)
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1511880305-12166-1-git-send-email-valtteri.rantala@intel.com
2017-11-30 13:32:51 +02:00
Tvrtko Ursulin
cf669b4e9f drm/i915: Consolidate checks for engine stats availability
Sagar noticed the check can be consolidated between the engine stats
implementation and the PMU.

My first choice was a static inline helper but that got into include
ordering mess quickly fast so I went with a macro instead. At some point
we should perhaps looking into taking out the non-ringubffer bits from
intel_ringbuffer.h into a new intel_engine.h or something.

v2: Use engine->flags. (Chris Wilson)
v3: Rebase and mark GuC as not yet supported. (Chris Wilson)
v4: Move flag setting to intel_engines_reset_default_submission.
    (Chris Wilson)
v5: Move flag setting to logical_ring_setup.
v6: intel_engines_reset_default_submission is the wrong place to set the
    flag - it needs to be in execlists_set_default_submission. (Sagar)
v7: Flag setting in logical_ring_setup is not required. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> (v6)
Link: https://patchwork.freedesktop.org/patch/msgid/20171129102805.22690-1-tvrtko.ursulin@linux.intel.com
2017-11-29 12:29:40 +00:00
Tvrtko Ursulin
30e17b7847 drm/i915: Engine busy time tracking
Track total time requests have been executing on the hardware.

We add new kernel API to allow software tracking of time GPU
engines are spending executing requests.

Both per-engine and global API is added with the latter also
being exported for use by external users.

v2:
 * Squashed with the internal API.
 * Dropped static key.
 * Made per-engine.
 * Store time in monotonic ktime.

v3: Moved stats clearing to disable.

v4:
 * Comments.
 * Don't export the API just yet.

v5: Whitespace cleanup.

v6:
 * Rename ref to active.
 * Drop engine aggregate stats for now.
 * Account initial busy period after enabling stats.

v7:
 * Rebase.

v8:
 * Move context in notification after the notifier. (Chris Wilson)

v9:

In cases where stats tracking is getting disabled while there is
an active context on an engine, add up the current value to the
total. This also implies we don't clear the total when tracking
is disabled any longer. There is no real need to do so because
we define the stats as relative while enabled, meaning
comparison between two samples while tracking is enabled is the
valid usage. However, when busy stats will later be plugged into
the perf PMU API, it is beneficial to not reset the total, since
the PMU core likes to do some counter disable/enable cycles on
startup, and while doing so during a single long context
executing on an engine we would lose some accuracy and so make
unit testing more difficult than needs to be.

v10:
 * Fix accounting for preemption.

v11:
 * Rebase for i915_modparams.enable_execlists removal.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-5-tvrtko.ursulin@linux.intel.com
2017-11-22 11:25:02 +00:00
Tvrtko Ursulin
b46a33e271 drm/i915/pmu: Expose a PMU interface for perf queries
From: Chris Wilson <chris@chris-wilson.co.uk>
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
From: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>

The first goal is to be able to measure GPU (and invidual ring) busyness
without having to poll registers from userspace. (Which not only incurs
holding the forcewake lock indefinitely, perturbing the system, but also
runs the risk of hanging the machine.) As an alternative we can use the
perf event counter interface to sample the ring registers periodically
and send those results to userspace.

Functionality we are exporting to userspace is via the existing perf PMU
API and can be exercised via the existing tools. For example:

  perf stat -a -e i915/rcs0-busy/ -I 1000

Will print the render engine busynnes once per second. All the performance
counters can be enumerated (perf list) and have their unit of measure
correctly reported in sysfs.

v1-v2 (Chris Wilson):

v2: Use a common timer for the ring sampling.

v3: (Tvrtko Ursulin)
 * Decouple uAPI from i915 engine ids.
 * Complete uAPI defines.
 * Refactor some code to helpers for clarity.
 * Skip sampling disabled engines.
 * Expose counters in sysfs.
 * Pass in fake regs to avoid null ptr deref in perf core.
 * Convert to class/instance uAPI.
 * Use shared driver code for rc6 residency, power and frequency.

v4: (Dmitry Rogozhkin)
 * Register PMU with .task_ctx_nr=perf_invalid_context
 * Expose cpumask for the PMU with the single CPU in the mask
 * Properly support pmu->stop(): it should call pmu->read()
 * Properly support pmu->del(): it should call stop(event, PERF_EF_UPDATE)
 * Introduce refcounting of event subscriptions.
 * Make pmu.busy_stats a refcounter to avoid busy stats going away
   with some deleted event.
 * Expose cpumask for i915 PMU to avoid multiple events creation of
   the same type followed by counter aggregation by perf-stat.
 * Track CPUs getting online/offline to migrate perf context. If (likely)
   cpumask will initially set CPU0, CONFIG_BOOTPARAM_HOTPLUG_CPU0 will be
   needed to see effect of CPU status tracking.
 * End result is that only global events are supported and perf stat
   works correctly.
 * Deny perf driver level sampling - it is prohibited for uncore PMU.

v5: (Tvrtko Ursulin)

 * Don't hardcode number of engine samplers.
 * Rewrite event ref-counting for correctness and simplicity.
 * Store initial counter value when starting already enabled events
   to correctly report values to all listeners.
 * Fix RC6 residency readout.
 * Comments, GPL header.

v6:
 * Add missing entry to v4 changelog.
 * Fix accounting in CPU hotplug case by copying the approach from
   arch/x86/events/intel/cstate.c. (Dmitry Rogozhkin)

v7:
 * Log failure message only on failure.
 * Remove CPU hotplug notification state on unregister.

v8:
 * Fix error unwind on failed registration.
 * Checkpatch cleanup.

v9:
 * Drop the energy metric, it is available via intel_rapl_perf.
   (Ville Syrjälä)
 * Use HAS_RC6(p). (Chris Wilson)
 * Handle unsupported non-engine events. (Dmitry Rogozhkin)
 * Rebase for intel_rc6_residency_ns needing caller managed
   runtime pm.
 * Drop HAS_RC6 checks from the read callback since creating those
   events will be rejected at init time already.
 * Add counter units to sysfs so perf stat output is nicer.
 * Cleanup the attribute tables for brevity and readability.

v10:
 * Fixed queued accounting.

v11:
 * Move intel_engine_lookup_user to intel_engine_cs.c
 * Commit update. (Joonas Lahtinen)

v12:
 * More accurate sampling. (Chris Wilson)
 * Store and report frequency in MHz for better usability from
   perf stat.
 * Removed metrics: queued, interrupts, rc6 counters.
 * Sample engine busyness based on seqno difference only
   for less MMIO (and forcewake) on all platforms. (Chris Wilson)

v13:
 * Comment spelling, use mul_u32_u32 to work around potential GCC
   issue and somne code alignment changes. (Chris Wilson)

v14:
 * Rebase.

v15:
 * Rebase for RPS refactoring.

v16:
 * Use the dynamic slot in the CPU hotplug state machine so that we are
   free to setup our state as multi-instance. Previously we were re-using
   the CPUHP_AP_PERF_X86_UNCORE_ONLINE slot which is neither used as
   multi-instance, nor owned by our driver to start with.
 * Register the CPU hotplug handlers after the PMU, otherwise the callback
   will get called before the PMU is initialized which can end up in
   perf_pmu_migrate_context with an un-initialized base.
 * Added workaround for a probable bug in cpuhp core.

v17:
 * Remove workaround for the cpuhp bug.

v18:
 * Rebase for drm_i915_gem_engine_class getting upstream before us.

v19:
 * Rebase. (trivial)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171121181852.16128-2-tvrtko.ursulin@linux.intel.com
2017-11-22 11:24:57 +00:00
Chris Wilson
93c6e966b4 drm/i915: Remove i915.semaphores modparam
Having disabled the broken semaphores on Sandybridge, there is no need
for a modparam any more, so remove it in favour of a simple
HAS_LEGACY_SEMAPHORES() guard.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171120205504.21892-5-chris@chris-wilson.co.uk
2017-11-20 21:59:09 +00:00
Chris Wilson
af9ff6c70d drm/i915: Move debugfs/i915_semaphore_status to i915_engine_info
As the semaphores is just part of the engine, include it with the
general pretty printer universally used for debugging.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171120205504.21892-4-chris@chris-wilson.co.uk
2017-11-20 21:59:08 +00:00
Chris Wilson
79e6770cb1 drm/i915: Remove obsolete ringbuffer emission for gen8+
Since removing the module parameter to force selection of ringbuffer
emission for gen8, the code is defunct. Remove it.

To put the difference into perspective, a couple of microbenchmarks
(bdw i7-5557u, 20170324):
                                        ring          execlists
exec continuous nops on all rings:   1.491us            2.223us
exec sequential nops on each ring:  12.508us           53.682us
single nop + sync:                   9.272us           30.291us

vblank_mode=0 glxgears:            ~11000fps           ~9000fps

Since the earlier submission, gen8 ringbuffer submission has fallen
further and further behind in features. So while ringbuffer may hold the
throughput crown, in terms of interactive latency, execlists is much
better. Alas, we have no convenient metrics for such, other than
demonstrating things we can do with execlists but can not using
legacy ringbuffer submission.

We have made a few improvements to lowlevel execlists throughput,
and ringbuffer currently panics on boot! (bdw i7-5557u, 20171026):

                                        ring          execlists
exec continuous nops on all rings:       n/a            1.921us
exec sequential nops on each ring:       n/a           44.621us
single nop + sync:                       n/a           21.953us

vblank_mode=0 glxgears:                  n/a          ~18500fps

References: https://bugs.freedesktop.org/show_bug.cgi?id=87725
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Once-upon-a-time-Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171120205504.21892-2-chris@chris-wilson.co.uk
2017-11-20 21:54:58 +00:00
Chris Wilson
fb5c551ad5 drm/i915: Remove i915.enable_execlists module parameter
Execlists and legacy ringbuffer submission are no longer feature
comparable (execlists now offer greater functionality that should
overcome their performance hit) and obsoletes the unsafe module
parameter, i.e. comparing the two modes of execution is no longer
useful, so remove the debug tool.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> #i915_perf.c
Link: https://patchwork.freedesktop.org/patch/msgid/20171120205504.21892-1-chris@chris-wilson.co.uk
2017-11-20 21:53:59 +00:00
Sagar Arun Kamble
c6dce8f140 drm/i915: Update execlists tasklet naming
intel_lrc_irq_handler and i915_guc_irq_handler are HW submission related
tasklet functions. Name them with "submission_tasklet" suffix and
remove intel/i915 prefix as they are static. Also rename irq_tasklet
as just tasklet for clarity.

v2: s/_bh/_tasklet (Chris)

Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/1510839162-25197-2-git-send-email-sagar.a.kamble@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-11-16 15:01:31 +00:00
Chris Wilson
70a84f3c60 drm/i915: Unconditionally apply the Broxton register workaround set
Having removed the preproduction Broxton support (see commit 0102ba1fd8
("drm/i915: Add early BXT sdv to the list of preproduction machines")),
we know we then always need the production Broxton workaround set and do
not need a predicate upon revision.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171114134340.5439-2-chris@chris-wilson.co.uk
2017-11-14 15:16:31 +00:00
Chris Wilson
f3e2b2c540 drm/i915: Remove pre-production Broxton register workarounds
We've begun excluding pre-production Broxton machines since commit
0102ba1fd8 ("drm/i915: Add early BXT sdv to the list of preproduction
machines"), now remove the list of workaround register values for those
early machines.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20170927093325.24206-1-chris@chris-wilson.co.uk
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171114134340.5439-1-chris@chris-wilson.co.uk
2017-11-14 15:16:30 +00:00
Chris Wilson
34991bd48c drm/i915: Unify SLICE_UNIT_LEVEL_CLKGATE w/a for cnl
gem_workarounds reports that the SLICE_UNIT_LEVEL_CLKGATE write isn't
sticking. Commit 0a60797a0e ("drm/i915: Implement
ReadHitWriteOnlyDisable.") presumes that SLICE_UNIT_LEVEL_CLKGATE is a
masked register in the context image, but commit 90007bca61
("drm/i915/cnl: Introduce initial Cannonlake Workarounds.") lists it as
an ordering unmasked register. The masked write will be losing the
default settings if we trust the original commit. That gem_workarounds
reports the value is lost entirely is more worrying though -- but it
clearly suggests that it is not a masked register in the context image,
so unify both w/a to use the original rmw.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103705
Fixes: 0a60797a0e ("drm/i915: Implement ReadHitWriteOnlyDisable.")
References: 90007bca61 ("drm/i915/cnl: Introduce initial Cannonlake Workarounds.")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171111100336.11020-1-chris@chris-wilson.co.uk
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
2017-11-14 15:16:18 +00:00
Michel Thierry
ce453b3e42 drm/i915: Clear per-engine fault register as early as possible
From gen6, the hardware tracks address lookup failures and we should
clear those registers upon startup to prevent false positives. However,
this was happening before we have the engines defined (intel_uncore_init())
and the for_each_engine loop was just a nop. The earliest we can call
this is inside intel_engines_init_mmio().

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171111004448.12360-1-michel.thierry@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-11-13 19:03:13 +00:00
Chris Wilson
7c2fa7faf1 drm/i915: Stop caching the "golden" renderstate
As we now record the default HW state and so only emit the "golden"
renderstate once to prepare the HW, there is no advantage in keeping the
renderstate batch around as it will never be used again.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171110142634.10551-8-chris@chris-wilson.co.uk
2017-11-10 17:23:22 +00:00
Chris Wilson
d2b4b97933 drm/i915: Record the default hw state after reset upon load
Take a copy of the HW state after a reset upon module loading by
executing a context switch from a blank context to the kernel context,
thus saving the default hw state over the blank context image.
We can then use the default hw state to initialise any future context,
ensuring that each starts with the default view of hw state.

v2: Unmap our default state from the GTT after stealing it from the
context. This should stop us from accidentally overwriting it via the
GTT (and frees up some precious GTT space).

Testcase: igt/gem_ctx_isolation
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171110142634.10551-7-chris@chris-wilson.co.uk
2017-11-10 17:23:10 +00:00
Chris Wilson
ae6c457478 drm/i915: Force the switch to the i915->kernel_context
In the next few patches, we will have a hard requirement that we emit a
context-switch to the perma-pinned i915->kernel_context (so that we can
save the HW state using that context-switch). As the first context
itself may be classed as a kernel context, we want to be explicit in our
comparison. For an extra-layer of finesse, we can check the last
unretired context on the engine; as well as the last retired context
when idle.

v2: verbose verbosity
v3: Always force the switch, even when the engine is idle, and update
the assert that this happens before suspend.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> #v1
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171110142634.10551-2-chris@chris-wilson.co.uk
2017-11-10 17:20:25 +00:00
Tvrtko Ursulin
1803fcbca2 drm/i915: Define an engine class enum for the uABI
We want to be able to report back to userspace details about an engine's
class, and in return for userspace to be able to request actions
regarding certain classes of engines. To isolate the uABI from any
variations between hw generations, we define an abstract class for the
engines and internally map onto the hw.

v2: Remove MAX from the uABI; keep it internal if we need it, but don't
let userspace make the mistake of using it themselves.
v3: s/OTHER/INVALID/
  The use of OTHER is ill-defined, so remove it from the uABI as any
  future new type of engine can define a class to suit it. But keep a
  reserved value for an invalid class, so that we can always
  unambiguously express when something doesn't belong to the
  classification.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> #v2
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171110142634.10551-1-chris@chris-wilson.co.uk
2017-11-10 17:20:24 +00:00
Chris Wilson
30b29406d9 drm/i915: Restore the wait for idle engine after flushing interrupts
So it appears that commit 5427f20785 ("drm/i915: Bump wait-times for
the final CS interrupt before parking") was a little over optimistic in
its belief that it had successfully waited for all residual activity on
the engines before parking. Numerous sightings in CI since then of

<7>[   52.542886] [IGT] core_auth: executing
<3>[   52.561013] [drm:intel_engines_park [i915]] *ERROR* vcs0 is not idle before parking
<7>[   52.561215] intel_engines_park vcs0
<7>[   52.561229] intel_engines_park 	current seqno 98, last 98, hangcheck 0 [-247449 ms], inflight 0
<7>[   52.561238] intel_engines_park 	Reset count: 0
<7>[   52.561266] intel_engines_park 	Requests:
<7>[   52.561363] intel_engines_park 	RING_START: 0x00000000 [0x00000000]
<7>[   52.561377] intel_engines_park 	RING_HEAD:  0x00000000 [0x00000000]
<7>[   52.561390] intel_engines_park 	RING_TAIL:  0x00000000 [0x00000000]
<7>[   52.561406] intel_engines_park 	RING_CTL:   0x00000000
<7>[   52.561422] intel_engines_park 	RING_MODE:  0x00000200 [idle]
<7>[   52.561442] intel_engines_park 	ACTHD:  0x00000000_00000000
<7>[   52.561459] intel_engines_park 	BBADDR: 0x00000000_00000000
<7>[   52.561474] intel_engines_park 	Execlist status: 0x00000301 00000000
<7>[   52.561489] intel_engines_park 	Execlist CSB read 5 [5 cached], write 5 [5 from hws], interrupt posted? no
<7>[   52.561500] intel_engines_park 		ELSP[0] idle
<7>[   52.561510] intel_engines_park 		ELSP[1] idle
<7>[   52.561519] intel_engines_park 		HW active? 0x0
<7>[   52.561608] intel_engines_park Idle? yes
<7>[   52.561617] intel_engines_park

on Braswell, which indicates that the engine just needs that little bit
longer after flushing the tasklet to settle. So give it a few more
milliseconds before declaring an err and applying the emergency brake.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103479
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171110112550.28909-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2017-11-10 12:55:57 +00:00
Rafael Antognolli
0a60797a0e drm/i915: Implement ReadHitWriteOnlyDisable.
The workaround for this is described as:

"if RenderSurfaceState.Num_Multisamples > 1, disable RCC clock gating if
RenderSurfaceState.Num_Multisamples == 1, set 0x7010[14] = 1"

Further documentation in the internal bug referenced by the bspec
suggest that any of the above suggestions should suffice to fix the
issue. We are going with disabling RCC clock gating.

Unfortunately, what we are doing doesn't match the name of the
workaround, but at least it matches its description.

This change improves CNL stability by avoiding some of the hangs seen in
the platform.

v2: Only disable RCC clock gating.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171103183027.5051-1-rafael.antognolli@intel.com
2017-11-08 12:43:17 -08:00
Chris Wilson
c400cc2a1a drm/i915: Include intel_engine_is_idle() status in engine pretty-printer
Upon parking, if we discover an active engine we dump its state. Follow
that state with an indication of whether the engine was idle.

References: https://bugs.freedesktop.org/show_bug.cgi?id=103479
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171107152211.19930-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-11-08 15:40:30 +00:00
Chris Wilson
820c5bbbf4 drm/i915: Flush the irq and tasklets before asserting engine is idle
Before we assert that the engine is idle, make sure we flush any
residual tasklet. After that point, if the engine is not idle, more work
may be queued despite us trying to park the engine and go to sleep.

References: https://bugs.freedesktop.org/show_bug.cgi?id=103479
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171101202149.32493-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2017-11-02 11:24:59 +00:00
Chris Wilson
3265124a2d drm/i915: Give more details for the active-when-parking warning for the engines
If the we think the engine is still active when we attempt to park it,
we want more details -- so dump the engine state.

References: https://bugs.freedesktop.org/show_bug.cgi?id=103479
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171027110617.31745-4-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2017-11-01 15:09:39 +00:00
Chris Wilson
680273879d drm/i915: Move parking-while-active warning to intel_engines_park()
We will want to break this down to give detailed per-engine warnings as
to why we still think we are active as we attempt to park the engines.
For the first step, just move the warning verbatim from the idle-worker
to intel_engines_park().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171027110617.31745-3-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2017-11-01 15:09:12 +00:00
Chris Wilson
3c75de5b98 drm/i915: Include RING_MODE when dumping the engine state
Knowing the RING_MODE flags is useful for checking the state of the
engine, such as whether the CS is idle after trying to stop the engines
before reset.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171026115048.20144-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2017-10-26 21:35:21 +01:00
Michał Winiarski
a4598d1755 drm/i915: Rename helpers used for unwinding, use macro for can_preempt
We would also like to make use of execlist_cancel_port_requests and
unwind_incomplete_requests in GuC preemption backend.
Let's rename the functions to use the correct prefixes, so that we can
simply add the declarations in the following patch.
Similar thing for applies for can_preempt, except we're introducing
HAS_LOGICAL_RING_PREEMPTION macro instad, converting other users that
were previously touching device info directly.

v2: s/intel_engine/execlists and pass execlists to unwind (Chris)
v3: use locked version for exporting, drop const qual (Chris)

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20171025200020.16636-11-michal.winiarski@intel.com
2017-10-26 21:35:21 +01:00
Chris Wilson
aba5e27858 drm/i915: Add a hook for making the engines idle (parking) and unparking
In the next patch, we will want to install a callback when the engines
(GT as a whole) become idle and similarly when they first become busy.
To enable that callback, first rename intel_engines_mark_idle() to
intel_engines_park() and provide the companion intel_engines_unpark().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171025143943.7661-2-chris@chris-wilson.co.uk
2017-10-25 18:47:30 +01:00
Chris Wilson
20ccd4d3f6 drm/i915: Use same test for eviction and submitting kernel context
During evict, we wish to idle the GPU if we see that the GGTT is full.
However, our test for idle in i915_gem_evict_something() and in
i915_gem_switch_to_kernel_context() do not match leading to
disappointment - we never believe that we are idle and keep trying to
flush the GGTT ad infinitum.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103438
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171024220855.30155-2-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-10-25 12:13:03 +01:00
Chris Wilson
4a118ecbe9 drm/i915: Filter out spurious execlists context-switch interrupts
Back in commit a4b2b01523 ("drm/i915: Don't mark an execlists
context-switch when idle") we noticed the presence of late
context-switch interrupts. We were able to filter those out by looking
at whether the ELSP remained active, but in commit beecec9017
("drm/i915/execlists: Preemption!") that became problematic as we now
anticipate receiving a context-switch event for preemption while ELSP
may be empty. To restore the spurious interrupt suppression, add a
counter for the expected number of pending context-switches and skip if
we do not need to handle this interrupt to make forward progress.

v2: Don't forget to switch on for preempt.
v3: Reduce the counter to a on/off boolean tracker. Declare the HW as
active when we first submit, and idle after the final completion event
(with which we confirm the HW says it is idle), and track each source
of activity separately. With a finite number of sources, it should aide
us in debugging which gets stuck.

Fixes: beecec9017 ("drm/i915/execlists: Preemption!")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171023213237.26536-3-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2017-10-24 15:54:37 +01:00
Oscar Mateo
930a784d02 drm/i915: Use a mask when applying WaProgramL3SqcReg1Default
Otherwise we are blasting other bits in GEN8_L3SQCREG1 that might be important
(although we probably aren't at the moment because 0 seems to be the default
for all the other bits).

v2: Extra parentheses (Michel)

Fixes: 050fc46 ("drm/i915:bxt: implement WaProgramL3SqcReg1DefaultForPerf")
Fixes: 450174f ("drm/i915/chv: Tune L3 SQC credits based on actual latencies")
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1508271945-14961-1-git-send-email-oscar.mateo@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-10-17 23:21:11 +01:00
Chris Wilson
a27d5a44ec drm/i915: Add in-flight request details to intel_engine_dump()
In the intel_engine_cs dumper, we were showing the request details for
the request queue but not of those requests already passed to the hw
(just a summary of the seqno). If we show those details, we can then
eliminate the entirely redundant and forgotten debugfs/i915_gem_request

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jeff McGee <jeff.mcgee@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171015204310.17045-1-chris@chris-wilson.co.uk
Reviewed-by: Jeff McGee <jeff.mcgee@intel.com>
2017-10-16 21:13:05 +01:00
Weinan Li
1fd51d9d97 drm/i915: enable to read CSB and CSB write pointer from HWSP in GVT-g VM
Let GVT-g VM read the CSB and CSB write pointer from virtual HWSP, not all
the host support this feature, need to check the BIT(3) of caps in PVINFO.

v3 : Remove unnecessary comments.
v4 : Separate VM enable patch with GVT-g implementation patch due to code
dependency.
v5 : Use inline for GVT virtual HWSP caps check function.
v6 : Comments refine.

Signed-off-by: Weinan Li <weinan.z.li@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1508039725-1066-1-git-send-email-weinan.z.li@intel.com
2017-10-16 13:56:29 +03:00
Chris Wilson
f636edb214 drm/i915: Make i915_engine_info pretty printer to standalone
We can use drm_printer to hide the differences between printk and
seq_printf, and so make the i915_engine_info pretty printer able to be
called from different contexts and not just debugfs. For instance, I
want to use the pretty printer to debug kselftests.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171009110301.21705-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2017-10-09 17:07:28 +01:00
Oscar Mateo
3cf1934abe drm/i915/cnl: Do not add an extra page for precaution in the Gen10 LRC size
BSpec indicates exactly 16752 DWORDs (17 pages), plus one page for PPHWSP. Please
notice that, when looking at the BSpec context image table, the right filter has
to be applied (e.g. "CNL") as some rows are excluded for specific GENs.

BSpec: 1383

v2: Update count and add BSpec tag (Joonas)
v3: Warning about filters in the commit message (Joonas)

Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Fixes: 7fd0b1a ("drm/i915/cnl: Add Gen10 LRC size")
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1507131592-29209-1-git-send-email-oscar.mateo@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-10-05 18:04:34 +01:00
Chris Wilson
e7af311683 drm/i915: Introduce a preempt context
Add another perma-pinned context for using for preemption at any time.
We cannot just reuse the existing kernel context, as first and foremost
we need to ensure that we can preempt the kernel context itself, so
require a distinct context id. Similar to the kernel context, we may
want to interrupt execution and switch to the preempt context at any
time, and so it needs to be permanently pinned and available.

To compensate for yet another permanent allocation, we shrink the
existing context and the new context by reducing their ringbuffer to the
minimum.

v2: Assert that we never allocate a request from the preemption context.
v3: Limit perma-pin to engines that may preempt.
v4: Onion cleanup for early driver death
v5: Onion ordering in main driver cleanup as well.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171003203453.15692-4-chris@chris-wilson.co.uk
2017-10-04 17:52:45 +01:00
Michał Winiarski
5152defe4a drm/i915/preempt: Default to disabled mid-command preemption levels
Supporting fine-granularity preemption levels may require changes in
userspace batch buffer programming. Therefore, we need to fallback to
safe default values, rather that use hardware defaults. Userspace is
still able to enable fine-granularity, since we're whitelisting the
register controlling it in WaEnablePreemptionGranularityControlByUMD.

v2: Extend w/a to cover Cannonlake
v3: Fix commentary to include both fake w/a names.

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171003203453.15692-2-chris@chris-wilson.co.uk
2017-10-04 17:52:45 +01:00
Jeff McGee
1e998343f9 drm/i915/preempt: Fix WaEnablePreemptionGranularityControlByUMD
The WA applies to all production Gen9 and requires both enabling and
whitelisting of the per-context preemption control register.

v2: Extend to Cannonlake.

Signed-off-by: Jeff McGee <jeff.mcgee@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171003203453.15692-1-chris@chris-wilson.co.uk
2017-10-04 17:52:45 +01:00
Chris Wilson
8d488bbec7 drm/i915: Remove WA_(SET|CLR)_BIT
These macros are of dubious merit when coupled with the per-context w/a
set. Instead of tweaking the value in the context, they tweak the value
based on the mmio at the time of recording; they are almost by
definition not per-context! Having removed the last users, remove the
macros to avoid temptation in the future.

v2: Kill WA_WRITE as well (now also unused).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171004124153.14142-2-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2017-10-04 15:05:40 +01:00
Chris Wilson
53221e11c7 drm/i915: Move MMCD_MISC_CTRL from context w/a to standard
Looking at gem_workarounds shows us that MMCD_MISC_CTRL is not restored
following a suspend-resume cycle. This implies that MMCD_MISC_CTRL is
not stored in the context, but is an ordinary register w/a that we need to
restore during init_hw.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171004124153.14142-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2017-10-04 15:05:40 +01:00
Oscar Mateo
32ced39c1b drm/i915: Transform whitelisting WAs into a simple reg write
RING_FORCE_TO_NONPRIV registers do not live in the logical context. They are simply
global privileged MMIO registers that happen to be powercontext saved and restored
(meaning only they can survive RC6). Therefore, there is absolutely no need to save
them so that they can be restored everytime we create a new logical context.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1506638439-6903-1-git-send-email-oscar.mateo@intel.com
Acked-by: Michel Thierry <michel.thierry@intel.com>
Tested-by: Chris Wilson <chris@chris-wilson.co.uk> #bxt
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-10-04 15:05:39 +01:00
Mika Kuoppala
76e70087d3 drm/i915: Make execlist port count variable
As we emulate execlists on top of the GuC workqueue, it is not
restricted to just 2 ports and we can increase that number arbitrarily
to trade-off queue depth (i.e. scheduling latency) against pipeline
bubbles.

v2: rebase. better commit msg (Chris)
v3: rebase

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20170922124307.10914-5-mika.kuoppala@intel.com
2017-09-25 11:33:53 +03:00
Mika Kuoppala
19df9a5782 drm/i915: Move execlist initialization into intel_engine_cs.c
Move execlist init into a common engine setup. As it is
common to both guc and hw execlists.

v2: rebase with csb changes
v3: rebase

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20170922124307.10914-2-mika.kuoppala@intel.com
2017-09-25 11:33:32 +03:00
Mika Kuoppala
b620e87021 drm/i915: Make own struct for execlist items
Engine's execlist related items have been increasing to
a point where a separate struct is warranted. Carve execlist
specific items to a dedicated struct to add clarity.

v2: add kerneldoc and fix whitespace (Joonas, Chris)
v3: csb_mmio changes, rebase
v4: s/\b(el|execlist)\b/execlists/ (Joonas)

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> (v3)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v3)
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170922124307.10914-1-mika.kuoppala@intel.com
2017-09-25 11:33:23 +03:00
Oscar Mateo
7fd0b1a259 drm/i915/cnl: Add Gen10 LRC size
The total size of the context has decreased with the removal of the
URB_ATOMIC section. BSpec indicates 16750 DWORDs (17 pages), plus
one page for PPHWSP, and I'm throwing an extra page for precaution.

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1506035989-14295-1-git-send-email-oscar.mateo@intel.com
2017-09-22 06:30:38 -07:00
Michal Wajdeczko
4f044a88a8 drm/i915: Rename global i915 to i915_modparams
Our global struct with params is named exactly the same way
as new preferred name for the drm_i915_private function parameter.
To avoid such name reuse lets use different name for the global.

v5: pure rename
v6: fix

Credits-to: Coccinelle

@@
identifier n;
@@
(
-	i915.n
+	i915_modparams.n
)

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Ville Syrjala <ville.syrjala@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170919193846.38060-1-michal.wajdeczko@intel.com
2017-09-22 14:50:36 +03:00
Ville Syrjälä
93564044fb drm/i915: Switch over to the LLC/eLLC hotspot avoidance hash mode for CCS
Use the LLC/eLLC hotspot avoidance mode for CCS on LLC machines. This is
reported to give better performance.

Testing has indicated that we don't need to enforce any massive 2 or 4
MiB alignment for all compressed resources even though there are still
plenty of stale comments in the spec suggesting that we do.

We do need to make sure every hardware unit that deals with the
compressed data uses the same hash mode.

Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Daniel Stone <daniels@collabora.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170824191100.10949-4-ville.syrjala@linux.intel.com
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
2017-09-14 15:02:53 +03:00
Chris Wilson
34a04e5e46 drm/i915: Allow HW status page to be bound high
At the time of commit 1f767e02d6 ("drm/i915: HWS must be in the
mappable region for g33"), drm_mm insertion would often default to
placing a new object high in the zone forcing us to specify that certain
HWSP must be bound within the low mappable region. Since then, drm_mm
has gained more finesse over its placement and exposes that to the
caller, commit 4e64e5539d ("drm: Improve drm_mm search (and fix
topdown allocation) with rbtrees"). As such where possible we want the
HWSP to be outside of the mappable aperture and so need to specify that
can be pinned high.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170913085605.18299-4-chris@chris-wilson.co.uk
2017-09-13 15:02:52 +01:00
Daniele Ceraolo Spurio
486e93f72a drm/i915/lrc: allocate separate page for HWSP
On gen8+ we're currently using the PPHWSP of the kernel ctx as the
global HWSP. However, when the kernel ctx gets submitted (e.g. from
__intel_autoenable_gt_powersave) the HW will use that page as both
HWSP and PPHWSP. This causes a conflict in the register arena of the
HWSP, i.e. dword indices below 0x30. We don't current utilize this arena,
but in the following patches we will take advantage of the cached
register state for handling execlist's context status interrupt.

To avoid the conflict, instead of re-using the PPHWSP of the kernel
ctx we can allocate a separate page for the HWSP like what happens for
pre-execlists platform.

v2: Add a use-case for the register arena of the HWSP.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1499357440-34688-1-git-send-email-daniele.ceraolospurio@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170913085605.18299-3-chris@chris-wilson.co.uk
2017-09-13 15:02:39 +01:00
Oscar Mateo
212154ba4d drm/i915: Transform WaDisablePooledEuLoadBalancingFix into a simple register write
FF_SLICE_CS_CHICKEN2 does not belong to the context image.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1504798809-5653-6-git-send-email-oscar.mateo@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-07 21:59:34 +01:00
Oscar Mateo
c6ea497c40 drm/i915: Transform WaDisableDynamicCreditSharing into a simple register write
GAMT_CHKN_BIT_REG does not live in the context image.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1504798809-5653-5-git-send-email-oscar.mateo@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-07 21:59:28 +01:00
Oscar Mateo
4827c547c5 drm/i915: Transform WaDisableGafsUnitClkGating into a simple reg write
GEN7_UCGCTL4 does not live in the context.

v2: Missing parenthesis

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1504798809-5653-4-git-send-email-oscar.mateo@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-07 21:59:20 +01:00
Oscar Mateo
b27f59010f drm/i915: WaPushConstantDereferenceHoldDisable needs to modify a masked register
So do it correctly.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1504798809-5653-3-git-send-email-oscar.mateo@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-07 21:59:15 +01:00
Oscar Mateo
6cf20a0128 drm/i915: Transform WaDisableI2mCycleOnWRPort into a simple reg write
GAMT_CHKN_BIT_REG does not live in the context.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1504798809-5653-2-git-send-email-oscar.mateo@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-07 21:59:07 +01:00
Oscar Mateo
efc886cb13 drm/i915: Transform WaInPlaceDecompressionHang into a simple reg write
Afaict, GEN9_GAMT_ECO_REG_RW_IA does not live in the context, so writing
it on every context creation is overkill (and wrong).

v2: Missing end parenthesis

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1504798809-5653-1-git-send-email-oscar.mateo@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-09-07 21:58:55 +01:00
Rodrigo Vivi
aa9f4c4f19 drm/i915/cnl: WaThrottleEUPerfToAvoidTDBackPressure:cnl(pre-prod)
Wa for B-stepping only.

A for a hang issue that requires throttling EU performace
to 12.5% to avoid back pressure to thread dispatch

v2: Rebased. No change from v1.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Oscar Mateo <oscar.mateo@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170906220325.24524-1-rodrigo.vivi@intel.com
2017-09-06 15:04:11 -07:00
Chris Wilson
90cad095ee drm/i915: Disable MI_STORE_DATA_IMM for i915g/i915gm
The early gen3 machines (i915g/Grantsdale and i915gm/Alviso) share a lot
of characteristics in their MI/GTT blocks with gen2, and in particular
can only use physical addresses in MI_STORE_DATA_IMM. This makes it
incompatible with our usage, so include those two machines in the
blacklist to prevent usage.

v2: Make it easy for gcc and rewrite it as a switch to save some space.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> #v1
Link: https://patchwork.freedesktop.org/patch/msgid/20170906152859.5304-1-chris@chris-wilson.co.uk
2017-09-06 17:36:30 +01:00
Rodrigo Vivi
86ebb015fa drm/i915/cnl: WaDisableI2mCycleOnWRPort
On CNL B0 stepping GAM is not able to detect some deadlock
condition and then rise the rise the gam_coh_flush.

WA database and spec both mentions to set 4AB8[24]=1 as
workaround. Although register offset 0x4AB8 is not
documented for any platform.

References: HSD#1945815, BSID#1112

Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170829230751.21047-1-rodrigo.vivi@intel.com
2017-08-30 21:57:10 -07:00
Rodrigo Vivi
392572feb0 drm/i915/cnl: WA FtrEnableFastAnisoL1BankingFix
WA to enable HW L1 Banking fix that allows aniso to operate
at full sample rate.

References: HSD#1937670

Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Anuj Phogat <anuj.phogat@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170829230723.20898-1-rodrigo.vivi@intel.com
2017-08-30 21:56:06 -07:00
Rodrigo Vivi
acfb5554c7 drm/i915/cnl: WaForceContextSaveRestoreNonCoherent
To avoid a potential hang condition with TLB invalidation
we need to enable masked bit 5 of MMIO 0xE5F0 at boot.

Same workaround was in place for previous platforms,
but the register offset has changed for CNL.
But also BSpec doesn't mention the bit 15 as set on gen9
platforms and mark bit as reserved on CNL.

v2: Improve commit message accepting Oscar's suggestion.

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Oscar Mateo <oscar.mateo@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170823203504.10012-1-rodrigo.vivi@intel.com
2017-08-23 14:35:11 -07:00
Oscar Mateo
2cbecff412 drm/i915/cnl: WaPushConstantDereferenceHoldDisable
CS sometimes hangs on 3D Push Constant dispatches with the new
deref enhancement logic in CNL.

v2: Improve the commit message (Rodrigo)

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1503518191-19116-1-git-send-email-oscar.mateo@intel.com
2017-08-23 13:31:18 -07:00
Rodrigo Vivi
d1d247543c drm/i915/cnl: WaDisableEnhancedSBEVertexCaching
WA forTDS handle reallocation getting dropped by SDE,
which may result in PS attribute corruption.

Disable enhanced SBE vertex caching in COMMON_SLICE_CHICKEN2 offset.

v2: Make it until B0 as spec tells. (by Mika).

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Oscar Mateo <oscar.mateo@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170815231651.975-3-rodrigo.vivi@intel.com
2017-08-18 15:32:58 -07:00
Rodrigo Vivi
e6d1a4f6b2 drm/i915/cnl: Add WaDisableReplayBufferBankArbitrationOptimization
WA to disable replay buffer destination buffer arbitration optimization.

Same Wa on previous platforms has a different name: WaToEnableHwFixForPushConstHWBug

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170815231651.975-2-rodrigo.vivi@intel.com
2017-08-18 15:32:34 -07:00
Rodrigo Vivi
90007bca61 drm/i915/cnl: Introduce initial Cannonlake Workarounds.
Let's inherit workarounds from previous platforms that
according to wa_database and BSpec are still valid for
Cannonlake.

v2: Add missed workarounds.
v3: Rebase
v4: Remove bad chunk that was added to rc6 disable. (Ander)
    Also remove A0 W/a that are not needed anymore.
v5: Rebase on top of CFL.
v6: Remove empty gen9_init_perctx_bb and gen9_init_indirectctx_bb
    since they don't carry any gen10 related W/a. (by Oscar).
    Also Remove A0 exclusive workaround.
v7: Remove more A0 exclusive workarounds. As pointed out by Oscar
    many workarounds were changed to be A0 only so let's remove
    them.

Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Oscar Mateo <oscar.mateo@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170815231651.975-1-rodrigo.vivi@intel.com
2017-08-18 15:32:17 -07:00
Chris Wilson
4d53568cca drm/i915: Move idle checks before intel_engine_init_global_seqno()
intel_engine_init_globa_seqno() may be called from an uncontrolled
set-wedged path where we have given up waiting for broken hw and declare
it defunct. Along that path, any sanity checks that the hw is idle
before we adjust its state will expectedly fail, so we simply cannot.
Instead of asserting inside init_global_seqno, we move them to the
normal caller reset_all_global_seqno() as it handles runtime seqno
wraparound.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170721123238.16428-8-chris@chris-wilson.co.uk
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2017-07-27 09:38:46 +02:00
Chris Wilson
d6edb6e3b6 drm/i915: Check the execlist queue for pending requests before declaring idle
Including a check against the execlist queue before calling the engine
idle and passing hangcheck.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170721123238.16428-6-chris@chris-wilson.co.uk
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2017-07-27 09:38:45 +02:00
Rodrigo Vivi
f65f841789 drm/i915/cnl: Gen10 render context size.
No change on render context size is required for Gen10.

So this patch doesn't change the default behaviour,
but only avoid the missing_case message.

Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Link: http://patchwork.freedesktop.org/patch/msgid/1499375184-5725-1-git-send-email-rodrigo.vivi@intel.com
2017-07-07 09:12:16 -07:00
Rodrigo Vivi
98eed3d1ad drm/i915/cfl: Fix Workarounds.
During the review of Coffee Lake workarounds Mika pointed out
that WaDisableKillLogic and GEN9_DISABLE_OCL_OOB_SUPPRESS_LOGIC
should be removed from CFL and with that I should carry the rv-b.

However when doing the v2 I removed another Workaround that should
remain because although not mentioned by spec the history of hangs
around it advocates on its favor.

On some follow-up patches I continued operating on the wrong
workardound, but Ville noticed that, so here is the fix for the
current CFL code that is upstream already.

Fixes: 46c26662d2 ("drm/i915/cfl: Introduce Coffee Lake workarounds.")
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
2017-06-29 13:08:21 -07:00
Chris Wilson
9cd90018eb drm/i915: Cancel pending execlists irq handler upon idling
Due to the slight asynchronicity in handling the execlists interrupts
(i.e. we defer the work to a handler that may consume more than one
interrupt event), when the engine is idle we may still have an irq
tasklet queued (especially when it has been deferred to a ksoftirqd).
At the beginning of the tasklet, we assert that we do hold a device
wakeref for the access we are about to perform. This assumes that when
we idle and release the GT wakeref, all execlists work has been
completed (since the elsp tracking says the hw is idle). However, there
may still be a tasklet queued, so as we mark the engine idle, also
cancel any pending tasklet.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170627152510.28589-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-06-28 11:30:33 +01:00
Tvrtko Ursulin
33def1ff7b drm/i915: Simplify intel_engines_init
We do not want to carry on over missing constructors and don't
need a duplicated engine mask checking which is already done
in the setup phase.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-06-20 12:42:12 +01:00
Rodrigo Vivi
46c26662d2 drm/i915/cfl: Introduce Coffee Lake workarounds.
Coffee Lake inherit most of Kabylake production
workarounds.

v2: Fix typo on commit message and remove
    WaDisableKillLogic and GEN9_DISABLE_OCL_OOB_SUPPRESS_LOGIC,
    since as Mika pointed out they shouldn't be here for cfl
    according to BSpec.

Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1497653398-15722-1-git-send-email-rodrigo.vivi@intel.com
2017-06-16 16:14:29 -07:00
Chris Wilson
aed2fc102f drm/i915: Check the ring is empty when declaring the engines are idle
As another precaution when testing whether the CS engine is actually
idle, also inspect the ring's HEAD/TAIL registers, which should be equal
when there are no commands left to execute by the GPU.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170530121334.17364-3-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2017-06-01 14:22:16 +01:00
Chris Wilson
a091d4ee93 drm/i915: Hold a wakeref for probing the ring registers
Allow intel_engine_is_idle() to be called outside of the GT wakeref by
acquiring the device runtime pm for ourselves. This allows the function
to act as check after we assume the engine is idle and we release the GT
wakeref held whilst we have requests. At the moment, we do not call it
outside of an awake context but taking the wakeref as required makes it
more convenient to use for quick debugging in future.

[ 2613.401647] RPM wakelock ref not held during HW access
[ 2613.401684] ------------[ cut here ]------------
[ 2613.401720] WARNING: CPU: 5 PID: 7739 at drivers/gpu/drm/i915/intel_drv.h:1787 gen6_read32+0x21f/0x2b0 [i915]
[ 2613.401731] Modules linked in: snd_hda_intel i915 vgem snd_hda_codec_hdmi x86_pkg_temp_thermal intel_powerclamp snd_hda_codec_realtek coretemp snd_hda_codec_generic crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm r8169 mii mei_me lpc_ich mei prime_numbers [last unloaded: i915]
[ 2613.401823] CPU: 5 PID: 7739 Comm: drv_missed_irq Tainted: G     U          4.12.0-rc2-CI-CI_DRM_421+ #1
[ 2613.401825] Hardware name: MSI MS-7924/Z97M-G43(MS-7924), BIOS V1.12 02/15/2016
[ 2613.401840] task: ffff880409e3a740 task.stack: ffffc900084dc000
[ 2613.401861] RIP: 0010:gen6_read32+0x21f/0x2b0 [i915]
[ 2613.401863] RSP: 0018:ffffc900084dfce8 EFLAGS: 00010292
[ 2613.401869] RAX: 000000000000002a RBX: ffff8804016a8000 RCX: 0000000000000006
[ 2613.401871] RDX: 0000000000000006 RSI: ffffffff81cbf2d9 RDI: ffffffff81c9e3a7
[ 2613.401874] RBP: ffffc900084dfd18 R08: ffff880409e3afc8 R09: 0000000000000000
[ 2613.401877] R10: 000000008a1c483f R11: 0000000000000000 R12: 000000000000209c
[ 2613.401879] R13: 0000000000000001 R14: ffff8804016a8000 R15: ffff8804016ac150
[ 2613.401882] FS:  00007f39ef3dd8c0(0000) GS:ffff88041fb40000(0000) knlGS:0000000000000000
[ 2613.401885] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2613.401887] CR2: 00000000023717c8 CR3: 00000002e7b34000 CR4: 00000000001406e0
[ 2613.401889] Call Trace:
[ 2613.401912]  intel_engine_is_idle+0x76/0x90 [i915]
[ 2613.401931]  i915_gem_wait_for_idle+0xe6/0x1e0 [i915]
[ 2613.401951]  fault_irq_set+0x40/0x90 [i915]
[ 2613.401970]  i915_ring_test_irq_set+0x42/0x50 [i915]
[ 2613.401976]  simple_attr_write+0xc7/0xe0
[ 2613.401981]  full_proxy_write+0x4f/0x70
[ 2613.401987]  __vfs_write+0x23/0x120
[ 2613.401992]  ? rcu_read_lock_sched_held+0x75/0x80
[ 2613.401996]  ? rcu_sync_lockdep_assert+0x2a/0x50
[ 2613.401999]  ? __sb_start_write+0xfa/0x1f0
[ 2613.402004]  vfs_write+0xc5/0x1d0
[ 2613.402008]  ? trace_hardirqs_on_caller+0xe7/0x1c0
[ 2613.402013]  SyS_write+0x44/0xb0
[ 2613.402020]  entry_SYSCALL_64_fastpath+0x1c/0xb1
[ 2613.402022] RIP: 0033:0x7f39eded6670
[ 2613.402025] RSP: 002b:00007fffdcdcb1a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 2613.402030] RAX: ffffffffffffffda RBX: ffffffff81470203 RCX: 00007f39eded6670
[ 2613.402033] RDX: 0000000000000001 RSI: 000000000041bc33 RDI: 0000000000000006
[ 2613.402036] RBP: ffffc900084dff88 R08: 00007f39ef3dd8c0 R09: 0000000000000001
[ 2613.402038] R10: 0000000000000000 R11: 0000000000000246 R12: 000000000041bc33
[ 2613.402041] R13: 0000000000000006 R14: 0000000000000000 R15: 0000000000000000
[ 2613.402046]  ? __this_cpu_preempt_check+0x13/0x20
[ 2613.402052] Code: 01 9b fa e0 0f ff e9 28 fe ff ff 80 3d 6a dd 0e 00 00 0f 85 29 fe ff ff 48 c7 c7 48 19 29 a0 c6 05 56 dd 0e 00 01 e8 da 9a fa e0 <0f> ff e9 0f fe ff ff b9 01 00 00 00 ba 01 00 00 00 44 89 e6 48
[ 2613.402199] ---[ end trace 31f0cfa93ab632bf ]---

Fixes: 5400367a86 ("drm/i915: Ensure the engine is idle before manually changing HWS")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170530121334.17364-2-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-05-30 17:18:32 +01:00
Chris Wilson
6c067579e6 drm/i915: Split execlist priority queue into rbtree + linked list
All the requests at the same priority are executed in FIFO order. They
do not need to be stored in the rbtree themselves, as they are a simple
list within a level. If we move the requests at one priority into a list,
we can then reduce the rbtree to the set of priorities. This should keep
the height of the rbtree small, as the number of active priorities can not
exceed the number of active requests and should be typically only a few.

Currently, we have ~2k possible different priority levels, that may
increase to allow even more fine grained selection. Allocating those in
advance seems a waste (and may be impossible), so we opt for allocating
upon first use, and freeing after its requests are depleted. To avoid
the possibility of an allocation failure causing us to lose a request,
we preallocate the default priority (0) and bump any request to that
priority if we fail to allocate it the appropriate plist. Having a
request (that is ready to run, so not leading to corruption) execute
out-of-order is better than leaking the request (and its dependency
tree) entirely.

There should be a benefit to reducing execlists_dequeue() to principally
using a simple list (and reducing the frequency of both rbtree iteration
and balancing on erase) but for typical workloads, request coalescing
should be small enough that we don't notice any change. The main gain is
from improving PI calls to schedule, and the explicit list within a
level should make request unwinding simpler (we just need to insert at
the head of the list rather than the tail and not have to make the
rbtree search more complicated).

v2: Avoid use-after-free when deleting a depleted priolist

v3: Michał found the solution to handling the allocation failure
gracefully. If we disable all priority scheduling following the
allocation failure, those requests will be executed in fifo and we will
ensure that this request and its dependencies are in strict fifo (even
when it doesn't realise it is only a single list). Normal scheduling is
restored once we know the device is idle, until the next failure!
Suggested-by: Michał Wajdeczko <michal.wajdeczko@intel.com>

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-8-chris@chris-wilson.co.uk
2017-05-17 13:38:09 +01:00
Chris Wilson
77f0d0e925 drm/i915/execlists: Pack the count into the low bits of the port.request
add/remove: 1/1 grow/shrink: 5/4 up/down: 391/-578 (-187)
function                                     old     new   delta
execlists_submit_ports                       262     471    +209
port_assign.isra                               -     136    +136
capture                                     6344    6359     +15
reset_common_ring                            438     452     +14
execlists_submit_request                     228     238     +10
gen8_init_common_ring                        334     341      +7
intel_engine_is_idle                         106     105      -1
i915_engine_info                            2314    2290     -24
__i915_gem_set_wedged_BKL                    485     411     -74
intel_lrc_irq_handler                       1789    1604    -185
execlists_update_context                     294       -    -294

The most important change there is the improve to the
intel_lrc_irq_handler and excclist_submit_ports (net improvement since
execlists_update_context is now inlined).

v2: Use the port_api() for guc as well (even though currently we do not
pack any counters in there, yet) and hide all port->request_count inside
the helpers.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170517121007.27224-5-chris@chris-wilson.co.uk
2017-05-17 13:38:06 +01:00
Arkadiusz Hiler
0b71cea29f drm/i915/gen9: Reintroduce WaEnableYV12BugFixInHalfSliceChicken7
This basically reverts commit 465418c606
("drm/i915/gen9: Remove WaEnableYV12BugFixInHalfSliceChicken7")
with small addition - marking it as affecting GLK as well.

It was incorrectly considered fixed in production steppings.

References: HSD#2126385, HSD#2131381, HSDES#1504433555, BSID#0764
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Jeff McGee <jeff.mcgee@intel.com>
Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
[Mika: s/KBL/GLK on commit message]
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170512112015.19082-1-arkadiusz.hiler@intel.com
2017-05-17 14:30:27 +03:00
Chris Wilson
266a240bf0 drm/i915: Use engine->context_pin() to report the intel_ring
Since unifying ringbuffer/execlist submission to use
engine->pin_context, we ensure that the intel_ring is available before
we start constructing the request. We can therefore move the assignment
of the request->ring to the central i915_gem_request_alloc() and not
require it in every engine->request_alloc() callback. Another small step
towards simplification (of the core, but at a cost of handling error
pointers in less important callers of engine->pin_context).

v2: Rearrange a few branches to reduce impact of PTR_ERR() on gcc's code
generation.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Oscar Mateo <oscar.mateo@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170504093308.4137-1-chris@chris-wilson.co.uk
2017-05-04 11:54:43 +01:00
Joonas Lahtinen
63ffbcdadc drm/i915: Sanitize engine context sizes
Pre-calculate engine context size based on engine class and device
generation and store it in the engine instance.

v2:
- Squash and get rid of hw_context_size (Chris)

v3:
- Move after MMIO init for probing on Gen7 and 8 (Chris)
- Retained rounding (Tvrtko)
v4:
- Rebase for deferred legacy context allocation

Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: intel-gvt-dev@lists.freedesktop.org
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-04-28 12:11:59 +03:00
Chris Wilson
546cdbc75b drm/i915: Stop touching hangcheck.seqno from intel_engine_init_global_seqno()
The hangcheck runs independently to the main flow of seqno through the
driver. However, we have an odd coupling of the seqno reset that is
unwelcome, and if poked at just the right rate can cause spurious hangs
(e.g. gem_exec_whisper) on an apparently idle engine.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170421083113.21321-1-chris@chris-wilson.co.uk
2017-04-21 09:58:24 +01:00
Chris Wilson
8968a3649a drm/i915: Pretend the engine is always idle when mocking
If we have a mock engine and it has no more requests in flight, report
it as idle as there is no hardware to contradict us! Otherwise we
attempt to query the hw that doesn't exist and find that the hw hasn't
set its idle bit and we get upset.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170411234427.14841-2-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-04-12 13:37:24 +01:00
Chris Wilson
a8e9a419c3 drm/i915: Lie and treat all engines as idle if wedged
Similar to commit 8490ae207f ("drm/i915: Suppress busy status for
engines if wedged") we also want to report intel_engine_is_idle() as
true as well as the main intel_engines_are_idle(), as we now check that
the engines are idle when overwriting the HWS page. This is not true
whilst we are setting the device as wedged, at least according to our
bookkeeping, so we have to lie to ourselves!

[  383.588601] [drm:i915_reset [i915]] *ERROR* Failed to reset chip: -110
[  383.588685] ------------[ cut here ]------------
[  383.588755] WARNING: CPU: 0 PID: 12 at drivers/gpu/drm/i915/intel_engine_cs.c:226 intel_engine_init_global_seqno+0x222/0x290 [i915]
[  383.588757] WARN_ON(!intel_engine_is_idle(engine))
[  383.588759] Modules linked in: ctr ccm snd_hda_codec_hdmi snd_hda_codec_conexant snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core arc4 iwldvm mac80211 snd_pcm snd_hwdep snd_seq_midi snd_seq_midi_event rfcomm bnep snd_rawmidi intel_powerclamp coretemp dm_multipath iwlwifi crct10dif_pclmul snd_seq crc32_pclmul ghash_clmulni_intel btusb aesni_intel btrtl btbcm aes_x86_64 crypto_simd cryptd btintel snd_timer glue_helper bluetooth intel_ips snd_seq_device cfg80211 snd soundcore binfmt_misc mei_me mei dm_mirror dm_region_hash dm_log i915 intel_gtt i2c_algo_bit drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt fb_sys_fops cfbcopyarea prime_numbers ahci libahci drm e1000e
[  383.588851] CPU: 0 PID: 12 Comm: migration/0 Not tainted 4.11.0-rc5+ #207
[  383.588853] Hardware name: LENOVO 514328U/514328U, BIOS 6QET44WW (1.14 ) 04/20/2010
[  383.588855] Call Trace:
[  383.588866]  dump_stack+0x63/0x90
[  383.588871]  __warn+0xc7/0xf0
[  383.588876]  warn_slowpath_fmt+0x4a/0x50
[  383.588883]  ? set_next_entity+0x821/0x910
[  383.588943]  intel_engine_init_global_seqno+0x222/0x290 [i915]
[  383.588998]  __i915_gem_set_wedged_BKL+0xa4/0x190 [i915]
[  383.589003]  ? __switch_to+0x215/0x390
[  383.589008]  multi_cpu_stop+0xbb/0xe0
[  383.589012]  ? cpu_stop_queue_work+0x90/0x90
[  383.589016]  cpu_stopper_thread+0x82/0x110
[  383.589021]  smpboot_thread_fn+0x137/0x190
[  383.589026]  kthread+0xf7/0x130
[  383.589030]  ? sort_range+0x20/0x20
[  383.589034]  ? kthread_park+0x90/0x90
[  383.589040]  ret_from_fork+0x2c/0x40

Fixes: 2ca9faa551 ("drm/i915: Assert the engine is idle before overwiting the HWS")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170411190042.25662-1-chris@chris-wilson.co.uk
2017-04-11 20:49:41 +01:00
Chris Wilson
5f9be05432 drm/i915: Bail if we do not setup the RCS engine
In places, we assume that RCS exists. This has been true forever, but
let us catch this failure during bringup by adding an explicit check
that we do have an RCS engine.

v2: Make use of HAS_ENGINE (Tvrtko)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170411165658.23828-1-chris@chris-wilson.co.uk
2017-04-11 18:59:00 +01:00
Chris Wilson
1d39f28170 drm/i915: Rename intel_engine_cs.exec_id to uabi_id
We want to refer to the index of the engine consistently throughout the
userspace ABI. We already have such an index through the execbuffer
engine specifier, that needs to be able to refer to each engine
specifically, so rename it the index to uabi_id to reflect its
generality beyond execbuf.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170411124306.15448-1-chris@chris-wilson.co.uk
2017-04-11 14:31:39 +01:00
Oscar Mateo
b8400f01fc drm/i915: Split the engine info table in two levels, using class + instance
There are some properties that logically belong to the engine class, and some
that belong to the engine instance. Make it explicit.

v2: Commit message (Tvrtko)

v3:
  - Rebased
  - Exec/uabi id should be per instance (Chris)

v4:
  - Rebased
  - Avoid re-ordering fields for smaller diff (Tvrtko)
  - Bug on oob access to the class array (Michal)

v5: Bug on the right thing (Michal)

v6: Rebased

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1491834873-9345-5-git-send-email-oscar.mateo@intel.com
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-04-11 13:00:33 +01:00
Oscar Mateo
6e516148f3 drm/i915: Generate the engine name based on the instance number
Not really needed, but makes the next change a little bit more compact.

v2:
  - Use zero-based numbering for engine names: xcs0, xcs1.. xcsN (Tvrtko, Chris)
  - Make sure the mock engine name is null-terminated (Tvrtko, Chris)

v3: Because I'm stupid (Chris)

v4: Verify engine name wasn't truncated (Michal)

v5:
  - Kill the warning in mock engine (Chris)

Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1491834873-9345-4-git-send-email-oscar.mateo@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-04-11 12:58:10 +01:00
Oscar Mateo
5ff36d36a3 drm/i915: Use the same vfunc for BSD2 ring init
If we needed to do something different for the init functions, we could
always look at the engine instance to make the distinction. But, in any
case, the two functions are virtually identical already (please notice
that BSD2_RING is only used from gen8 onwards).

With this, the init functions depends excusively on the engine class
(a fact that we will use soon).

v2: Commit message

Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1491834873-9345-3-git-send-email-oscar.mateo@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-04-11 12:57:51 +01:00
Daniele Ceraolo Spurio
0908180b9a drm/i915: Classify the engines in class + instance
In such a way that vcs and vcs2 are just two different instances (0 and 1)
of the same engine class (VIDEO_DECODE_CLASS).

v2: Align the instance types (Tvrtko)

v3: Don't use enums for bspec-defined stuff (Michal)

Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1491834873-9345-2-git-send-email-oscar.mateo@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-04-11 12:57:31 +01:00
Chris Wilson
2ca9faa551 drm/i915: Assert the engine is idle before overwiting the HWS
When we update the global seqno (on the engine timeline), we modify HW
state (both registers and mapped pages). As we do this, we should be
sure that the HW is idle and we are not causing a conflict. The caller
is supposed to wait_for_idle before calling us to update the seqno, so
let's assert they have and the engine is indeed idle.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170405153055.28123-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-04-07 10:10:13 +01:00
Chris Wilson
8490ae207f drm/i915: Suppress busy status for engines if wedged
If the driver is wedged, HW state may be very inconsistent and
report that it is still busy, even though we have stopped using it. This
can lead to a double *ERROR* rather than a graceful cleanup after
wedging.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170330145041.9005-2-chris@chris-wilson.co.uk
2017-03-30 17:58:44 +01:00
Chris Wilson
17ab792ab1 drm/i915: Drop verbose and archaic "ring" from our internal engine names
We pretty print the name of an engine in several places, mostly for
debug, but also in the GPU hang report. Using "ring" in the name is
archaic (we call those engines now to differentiate them from the
multiple rings of commands we execute on each engine), quite verbose and
often tautological. We run out of room in our GPU hang report for
instance if we have more than a couple of engines hung simultaneously.
Bit the bullet and update the strings to reflect the common internal names.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170330134820.12273-1-chris@chris-wilson.co.uk
2017-03-30 15:32:19 +01:00
Chris Wilson
24caf65593 drm/i915: intel_engine_init_global_seqno() requires atomic kmap
As intel_engine_init_global_seqno() may be called by
nop_submit_request() from inside irq context, we have to use atomic
versions of kmap/kunmap. This is rare as this requires using gen8 legacy
ringbuffer submission.

Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170320145609.4898-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-03-21 09:21:14 +00:00
Chris Wilson
ff44ad51eb drm/i915: Move engine->submit_request selection to a vfunc
It turns out that we may want to restore the original
engine->submit_request (and engine->schedule) callbacks from more than
just the guc <-> execlists transition. Move this to a vfunc so we can
have a common interface.

v2: Move initial selection to intel_engines_init_common(), repaint vfunc
with engine->set_default_submission (and a similar colour for the
helper).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170316171305.12972-2-chris@chris-wilson.co.uk
2017-03-16 17:17:12 +00:00
Changbin Du
3fc03069bc drm/i915: make context status notifier head be per engine
GVTg has introduced the context status notifier to schedule the GVTg
workload. At that time, the notifier is bound to GVTg context only,
so GVTg is not aware of host workloads.

Now we are going to improve GVTg's guest workload scheduler policy,
and add Guc emulation support for new Gen graphics. Both these two
features require acknowledgment for all contexts running on hardware.
(But will not alter host workload.) So here try to make some change.

The change is simple:
  1. Move the context status notifier head from i915_gem_context to
     intel_engine_cs. Which means there is a notifier head per engine
     instead of per context. Execlist driver still call notifier for
     each context sched-in/out events of current engine.
  2. At GVTg side, it binds a notifier_block for each physical engine
     at GVTg initialization period. Then GVTg can hear all context
     status events.

In this patch, GVTg do nothing for host context event, but later
will add a function there. But in any case, the notifier callback is
a noop if this is no active vGPU.

Since intel_gvt_init() is called at early initialization stage and
require the status notifier head has been initiated, I initiate it in
intel_engine_setup().

v2: remove a redundant newline. (chris)

Fixes: 3c7ba6359d ("drm/i915: Introduce execlist context status change notification")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100232
Signed-off-by: Changbin Du <changbin.du@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170313024711.28591-1-changbin.du@intel.com
Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-03-16 16:24:35 +00:00
Chris Wilson
14a6bbf9e5 drm/i915: Replace irq_seqno_barrier on hws write with a clflush
When manually overwriting the HWS, rather than assume irq_seqno_barrier
does the right thing, we can explicitly flush the cacheline instead.
This avoids us calling the engine->irq_seqno_barrier() from an illegal
context:

[ 1472.651797] BUG: scheduling while atomic: migration/0/11/0x00000002
[ 1472.651807] Modules linked in: ctr ccm arc4 snd_hda_codec_hdmi bnep rfcomm iwldvm snd_hda_codec_conexant snd_hda_codec_generic snd_hda_intel mac80211 snd_hda_codec snd_hda_core snd_pcm dm_multipath snd_hwdep intel_powerclamp coretemp snd_seq_midi crct10dif_pclmul snd_seq_midi_event crc32_pclmul iwlwifi ghash_clmulni_intel btusb snd_rawmidi btrtl aesni_intel btbcm aes_x86_64 crypto_simd btintel cryptd glue_helper bluetooth snd_seq cfg80211 snd_timer snd_seq_device intel_ips binfmt_misc snd mei_me soundcore mei dm_mirror dm_region_hash dm_log i915 intel_gtt i2c_algo_bit drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt fb_sys_fops cfbcopyarea prime_numbers e1000e drm ahci libahci
[ 1472.651897] CPU: 0 PID: 11 Comm: migration/0 Tainted: G     U          4.11.0-rc1+ #203
[ 1472.651899] Hardware name: LENOVO 514328U/514328U, BIOS 6QET44WW (1.14 ) 04/20/2010
[ 1472.651900] Call Trace:
[ 1472.651913]  dump_stack+0x63/0x90
[ 1472.651922]  __schedule_bug+0x5d/0x6b
[ 1472.651930]  __schedule+0x46a/0x5f0
[ 1472.651934]  schedule+0x38/0x90
[ 1472.651938]  schedule_hrtimeout_range_clock+0x85/0x110
[ 1472.651945]  ? hrtimer_init+0x10/0x10
[ 1472.651949]  schedule_hrtimeout_range+0xe/0x10
[ 1472.651952]  usleep_range+0x4d/0x60
[ 1472.652037]  gen5_seqno_barrier+0x13/0x20 [i915]
[ 1472.652101]  intel_engine_init_global_seqno+0xd7/0x160 [i915]
[ 1472.652160]  __i915_gem_set_wedged_BKL+0xa0/0x180 [i915]
[ 1472.652166]  multi_cpu_stop+0xbb/0xe0
[ 1472.652170]  ? cpu_stop_queue_work+0x90/0x90
[ 1472.652174]  cpu_stopper_thread+0x82/0x110
[ 1472.652179]  smpboot_thread_fn+0x137/0x190
[ 1472.652184]  kthread+0xf7/0x130
[ 1472.652187]  ? sort_range+0x20/0x20
[ 1472.652191]  ? kthread_park+0x90/0x90
[ 1472.652195]  ret_from_fork+0x2c/0x40

Testcase: igt/gem_eio #ilk
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170314111452.9375-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2017-03-16 14:26:28 +00:00
Michal Wajdeczko
237ae7c79e drm/i915: Don't use enums for hardware engine id
Generally we are using macros for any hardware identifiers as these
may change between Gens. Do the same with hardware engine ids.

v2: move hw engine defs to i915_reg.h (Chris)

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170301202615.118632-1-michal.wajdeczko@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-03-03 22:54:21 +00:00
Chris Wilson
0542524944 drm/i915: Generalise wait for execlists to be idle
The code to check for execlists completion is generic, so move it to
intel_engine_cs.c, where we can reuse the new intel_engine_is_idle().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170303121947.20482-2-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2017-03-03 13:08:15 +00:00
Chris Wilson
5400367a86 drm/i915: Ensure the engine is idle before manually changing HWS
During reset_all_global_seqno() on seqno rollover, we have to update the
HWS. This causes all in flight requests to be completed, so first we
wait. However, we were only waiting for the requests themselves to be
completed and clearing out the waiter rbtrees - what I had missed was
the extra reference in execlists->port[]. Since commit fe9ae7a3bf
("drm/i915/execlists: Detect an out-of-order context switch") we can
detect when the request is retired before the context switch interrupt
is completed. The impact should be neglible outside of debugging.

Testcase: igt/gem_exec_whisper
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170303121947.20482-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2017-03-03 13:08:04 +00:00
Chris Wilson
02e012f172 drm/i915: Move w/a LRI debug message from context-init to driver load
The spam of every context initialisation saying the same thing is annoying
me! Move the information to the setup of the engine.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170301121131.11588-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2017-03-01 20:29:24 +00:00
Chris Wilson
9b6586ae9f drm/i915: Keep a global seqno per-engine
Replace the global device seqno with one for each engine, and account
for in-flight seqno on each separately. This is consistent with
dma-fence as each timeline has separate fence-contexts for each engine
and a seqno is only ordered within a fence-context (i.e.  seqno do not
need to be ordered wrt to other engines, just ordered within a single
engine). This is required to enable request rewinding for preemption on
individual engines (we have to rewind the global seqno to avoid
overflow, and we do not have to rewind all engines just to preempt one.)

v2: Rename active_seqno to inflight_seqnos to more clearly indicate that
it is a counter and not equivalent to the existing seqno. Update
functions that operated on active_seqno similarly.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170223074422.4125-3-chris@chris-wilson.co.uk
2017-02-23 14:49:26 +00:00
Tvrtko Ursulin
133b4bd74d drm/i915: Move common workaround code to intel_engine_cs
It is used by all submission backends.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-02-17 11:39:59 +00:00
Tvrtko Ursulin
8ee7c6e23b drm/i915: Simplify cleanup path in intel_engines_init
We can call the engine cleanup vfunc instead of duplicating the
decision making here.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-02-17 11:39:59 +00:00