Commit Graph

1018 Commits

Author SHA1 Message Date
Ben Widawsky
d636951ec0 drm/i915: Cleanup instdone collection
Consolidate the instdone logic so we can get a bit fancier. This patch also
removes the duplicated print of INSTDONE[0].

v2: (Imre)
- Rebased on top of hangcheck INSTDONE changes.
- Move all INSTDONE registers into a single struct, store it within the
  engine error struct during error capturing.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1474379673-28326-1-git-send-email-imre.deak@intel.com
2016-09-21 15:32:36 +03:00
Dave Gordon
b20e3cfe4b drm/i915: clarify PMINTRMSK/pm_intr_keep usage
No functional changes; just renaming a bit, tweaking a datatype,
prettifying layout, and adding comments, in particular in the
GuC setup code that touches this data.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1473711577-11454-2-git-send-email-david.s.gordon@intel.com
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-09-15 10:55:43 +01:00
Chris Wilson
32c2b4bda6 drm/i915: Avoid incrementing hangcheck whilst waiting for external fence
If we are waiting upon an external fence, from the pov of hangcheck the
engine is stuck on the last submitted seqno. Currently we give a small
increment to the hangcheck score in order to catch a stuck waiter /
driver. Now that we both have an independent wait hangcheck and may be
stuck waiting on an external fence, resetting the GPU has little effect
on that external fence. As we cannot advance by resetting, skip
incrementing the hangcheck score.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160909131201.16673-19-chris@chris-wilson.co.uk
2016-09-09 14:23:07 +01:00
Chris Wilson
80b5bdbdcb drm/i915: Ignore valid but unknown semaphores
If we find a ring waiting on a semaphore for another assigned but not yet
emitted request, treat it as valid and waiting.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160909131201.16673-18-chris@chris-wilson.co.uk
2016-09-09 14:23:07 +01:00
Chris Wilson
780f262a70 drm/i915: Replace wait-on-mutex with wait-on-bit in reset worker
Since we have a cooperative mode now with a direct reset, we can avoid
the contention on struct_mutex and instead try then sleep on the
I915_RESET_IN_PROGRESS bit. If the mutex is held and that bit is
cleared, all is fine. Otherwise, we sleep for a bit and try again. In
the worst case we sleep for an extra second waiting for the mutex to be
released (no one touching the GPU is allowed the struct_mutex whilst the
I915_RESET_IN_PROGRESS bit is set). But when we have a direct reset,
this allows us to clean up the reset worker faster.

v2: Remember to call wake_up_bit() after changing (for the faster wakeup
as promised)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160909131201.16673-12-chris@chris-wilson.co.uk
2016-09-09 14:23:04 +01:00
Chris Wilson
221fe79945 drm/i915: Perform a direct reset of the GPU from the waiter
If a waiter is holding the struct_mutex, then the reset worker cannot
reset the GPU until the waiter returns. We do not want to return -EAGAIN
form i915_wait_request as that breaks delicate operations like
i915_vma_unbind() which often cannot be restarted easily, and returning
-EIO is just as useless (and has in the past proven dangerous). The
remaining WARN_ON(i915_wait_request) serve as a valuable reminder that
handling errors from an indefinite wait are tricky.

We can keep the current semantic that knowing after a reset is complete,
so is the request, by performing the reset ourselves if we hold the
mutex.

uevent emission is still handled by the reset worker, so it may appear
slightly out of order with respect to the actual reset (and concurrent
use of the device).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160909131201.16673-11-chris@chris-wilson.co.uk
2016-09-09 14:23:04 +01:00
Chris Wilson
8af29b0c78 drm/i915: Separate out reset flags from the reset counter
In preparation for introducing a per-engine reset, we can first separate
the mixing of the reset state from the global reset counter.

The loss of atomicity in updating the reset state poses a small problem
for handling the waiters. For requests, this is solved by advancing the
seqno so that a waiter waking up after the reset knows the request is
complete. For pending flips, we still rely on the increment of the
global reset epoch (as well as the reset-in-progress flag) to signify
when the hardware was reset.

The advantage, now that we do not inspect the reset state during reset
itself i.e. we no longer emit requests during reset, is that we can use
the atomic updates of the state flags to ensure that only one reset
worker is active.

v2: Mika spotted that I transformed the i915_gem_wait_for_error() wakeup
into a waiter wakeup.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Arun Siluvery <arun.siluvery@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470414607-32453-6-git-send-email-arun.siluvery@linux.intel.com
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160909131201.16673-7-chris@chris-wilson.co.uk
2016-09-09 14:23:02 +01:00
Chris Wilson
bafb0fced9 drm/i915: Make for_each_engine_masked() more compact and quicker
Rather than walk the full array of engines checking whether each is in
the mask in turn, we can use the mask to jump to the right engines. This
should quicker for a sparse array of engines or mask, whilst generating
smaller code:

   text	   data	    bss	    dec	    hex	filename
1251010	   4579	    800	1256389	 132bc5	drivers/gpu/drm/i915/i915.ko
1250530	   4579	    800	1255909	 1329e5	drivers/gpu/drm/i915/i915.ko

The downside is that we have to pass in a temporary, alas no C99
iterators yet.

[P.S. Joonas doesn't like having to pass extra temporaries into the
macro, and even less that I called them tmp. As yet, we haven't found a
macro that avoids passing in a temporary that is smaller. We probably
will get C99 iterators first!]

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160827075401.16470-2-chris@chris-wilson.co.uk
2016-08-27 09:42:53 +01:00
Chris Wilson
34730fed9f drm/i915: Ignore stuck requests when considering hangs
If the engine isn't being retired (worker starvation?) then it is
possible for us to repeatedly observe that between consecutive
hangchecks the seqno on the ring to be the same and there remain
unretired requests. Ignore these completely and only regard the engine
as busy for the purpose of hang detection (not stall detection) if there
are outstanding breadcrumbs.

In recent history we have looked at using both the request and seqno as
indication of activity on the engine, but that was reduced to just
inspecting seqno in commit cffa781e59 ("drm/i915: Simplify check for
idleness in hangcheck"). However, in commit dcff85c844 ("drm/i915:
Enable i915_gem_wait_for_idle() without holding struct_mutex"), I made
the decision to use the new common lockless function, under the
assumption that request retirement was more frequent than hangcheck and
so we would not have a stuck busy check. The flaw there was in
forgetting that we accumulate the hang score, and so successive checks
seeing a stuck request, albeit with the GPU advancing elsewhere and so
not necessary the same stuck request, would eventually trigger the hang.

Fixes: dcff85c844 ("drm/i915: Enable i915_gem_wait_for_idle()...")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160820145408.32180-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2016-08-22 13:09:11 +01:00
Chris Wilson
83348ba84e drm/i915: Move missed interrupt detection from hangcheck to breadcrumbs
In commit 2529d57050 ("drm/i915: Drop racy markup of missed-irqs from
idle-worker") the racy detection of missed interrupts was removed when
we went idle. This however opened up the issue that the stuck waiters
were not being reported, causing a test case failure. If we move the
stuck waiter detection out of hangcheck and into the breadcrumb
mechanims (i.e. the waiter) itself, we can avoid this issue entirely.
This leaves hangcheck looking for a stuck GPU (inspecting for request
advancement and HEAD motion), and breadcrumbs looking for a stuck
waiter - hopefully make both easier to understand by their segregation.

v2: Reduce the error message as we now run independently of hangcheck,
and the hanging batch used by igt also counts as a stuck waiter causing
extra warnings in dmesg.
v3: Move the breadcrumb's hangcheck kickstart to the first missed wait.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97104
Fixes: 2529d57050 (waiter"drm/i915: Drop racy markup of missed-irqs...")
Testcase: igt/drv_missed_irq
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470761272-1245-2-git-send-email-chris@chris-wilson.co.uk
2016-08-10 10:37:35 +01:00
Rodrigo Vivi
4194c088df drm/i915: Use drm official vblank_no_hw_counter callback.
No functional change. Instead of defining a new empty function
let's use what is available on drm.

It gets cleaner, and easy to read, and understand.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2016-08-05 09:40:07 -07:00
Chris Wilson
dcff85c844 drm/i915: Enable i915_gem_wait_for_idle() without holding struct_mutex
The principal motivation for this was to try and eliminate the
struct_mutex from i915_gem_suspend - but we still need to hold the mutex
current for the i915_gem_context_lost(). (The issue there is that there
may be an indirect lockdep cycle between cpu_hotplug (i.e. suspend) and
struct_mutex via the stop_machine().) For the moment, enabling last
request tracking for the engine, allows us to do busyness checking and
waiting without requiring the struct_mutex - which is useful in its own
right.

As a side-effect of having a robust means for tracking engine busyness,
we can replace our other busyness heuristic, that of comparing against
the last submitted seqno. For paranoid reasons, we have a semi-ordered
check of that seqno inside the hangchecker, which we can now improve to
an ordered check of the engine's busyness (removing a locked xchg in the
process).

v2: Pass along "bool interruptible" as being unlocked we cannot rely on
i915->mm.interruptible being stable or even under our control.
v3: Replace check Ironlake i915_gpu_busy() with the common precalculated value

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-6-git-send-email-chris@chris-wilson.co.uk
2016-08-05 10:54:37 +01:00
Chris Wilson
7e37f889b5 drm/i915: Rename struct intel_ringbuffer to struct intel_ring
The state stored in this struct is not only the information about the
buffer object, but the ring used to communicate with the hardware. Using
buffer here is overly specific and, for me at least, conflates with the
notion of buffer objects themselves.

s/struct intel_ringbuffer/struct intel_ring/
s/enum intel_ring_hangcheck/enum intel_engine_hangcheck/
s/describe_ctx_ringbuf()/describe_ctx_ring()/
s/intel_ring_get_active_head()/intel_engine_get_active_head()/
s/intel_ring_sync_index()/intel_engine_sync_index()/
s/intel_ring_init_seqno()/intel_engine_init_seqno()/
s/ring_stuck()/engine_stuck()/
s/intel_cleanup_engine()/intel_engine_cleanup()/
s/intel_stop_engine()/intel_engine_stop()/
s/intel_pin_and_map_ringbuffer_obj()/intel_pin_and_map_ring()/
s/intel_unpin_ringbuffer()/intel_unpin_ring()/
s/intel_engine_create_ringbuffer()/intel_engine_create_ring()/
s/intel_ring_flush_all_caches()/intel_engine_flush_all_caches()/
s/intel_ring_invalidate_all_caches()/intel_engine_invalidate_all_caches()/
s/intel_ringbuffer_free()/intel_ring_free()/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-15-git-send-email-chris@chris-wilson.co.uk
Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-4-git-send-email-chris@chris-wilson.co.uk
2016-08-02 22:58:16 +01:00
Chris Wilson
9930ca1ae7 drm/i915: Update a couple of hangcheck comments to talk about engines
We still have lots of comments that refer to the old ring when we mean
struct intel_engine_cs and its hardware correspondence. This patch fixes
an instance inside hangcheck to talk about engines.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-10-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469606850-28659-5-git-send-email-chris@chris-wilson.co.uk
2016-07-27 16:23:05 +01:00
Chris Wilson
f2f0ed718b drm/i915: Rename ring->virtual_start as ring->vaddr
Just a different colour to better match virtual addresses elsewhere.

s/ring->virtual_start/ring->vaddr/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1469005202-9659-9-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469017917-15134-8-git-send-email-chris@chris-wilson.co.uk
2016-07-20 13:40:39 +01:00
Chris Wilson
406ea8d22f drm/i915: Treat ringbuffer writes as write to normal memory
Ringbuffers are now being written to either through LLC or WC paths, so
treating them as simply iomem is no longer adequate. However, for the
older !llc hardware, the hardware is documentated as treating the TAIL
register update as serialising, so we can relax the barriers when filling
the rings (but even if it were not, it is still an uncached register write
and so serialising anyway.).

For simplicity, let's ignore the iomem annotation.

v2: Remove iomem from ringbuffer->virtual_address
v3: And for good measure add iomem elsewhere to keep sparse happy

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> #v2
Link: http://patchwork.freedesktop.org/patch/msgid/1469005202-9659-8-git-send-email-chris@chris-wilson.co.uk
Link: http://patchwork.freedesktop.org/patch/msgid/1469017917-15134-7-git-send-email-chris@chris-wilson.co.uk
2016-07-20 13:40:14 +01:00
Chris Wilson
29ecd78d3b drm/i915: Define a separate variable and control for RPS waitboost frequency
To allow the user finer control over waitboosting, allow them to set the
frequency we request for the boost. This also them allows to effectively
disable the boosting by setting the boost request to a low frequency.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1468397438-21226-5-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2016-07-14 15:22:58 +01:00
Rodrigo Vivi
22dea0be50 drm/i915: Introduce Kabypoint PCH for Kabylake H/DT.
Some Kabylake SKUs are going to use Kabypoint PCH.
It is mainly for Halo and DT ones.

>From our specs it doesn't seem that KBP brings
any change on the display south engine. So let's consider
this as a continuation of SunrisePoint, i.e., SPT+.

Since it is easy to get confused by a letter change:
KBL = Kabylake - CPU/GPU codename.
KBP = Kabypoint - PCH codename.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96826
Link: http://patchwork.freedesktop.org/patch/msgid/1467418032-15167-1-git-send-email-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2016-07-07 10:01:37 -07:00
Chris Wilson
aca34b6e1c drm/i915: Group the irq breadcrumb variables into the same cacheline
As we inspect both the tasklet (to check for an active bottom-half) and
set the irq-posted flag at the same time (both in the interrupt handler
and then in the bottom-halt), group those two together into the same
cacheline. (Not having total control over placement of the struct means
we can't guarantee the cacheline boundary, we need to align the kmalloc
and then each struct, but the grouping should help.)

v2: Try a couple of different names for the state touched by the user
interrupt handler.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467805142-22219-3-git-send-email-chris@chris-wilson.co.uk
2016-07-06 12:47:39 +01:00
Chris Wilson
91c8a326a1 drm/i915: Convert dev_priv->dev backpointers to dev_priv->drm
Since drm_i915_private is now a subclass of drm_device we do not need to
chase the drm_i915_private->dev backpointer and can instead simply
access drm_i915_private->drm directly.

   text	   data	    bss	    dec	    hex	filename
1068757	   4565	    416	1073738	 10624a	drivers/gpu/drm/i915/i915.ko
1066949	   4565	    416	1071930	 105b3a	drivers/gpu/drm/i915/i915.ko

Created by the coccinelle script:
@@
struct drm_i915_private *d;
identifier i;
@@
(
- d->dev->i
+ d->drm.i
|
- d->dev
+ &d->drm
)

and for good measure the dev_priv->dev backpointer was removed entirely.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467711623-2905-4-git-send-email-chris@chris-wilson.co.uk
2016-07-05 11:58:45 +01:00
Chris Wilson
2b2842887d drm/i915: Clean up GPU hang message
Remove some redundant kernel messages as we deduce a hung GPU and
capture the error state.

v2: Fix "hang" vs "no progress" message whilst I was there
v3: s/snprintf/scnprintf/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467618513-4966-2-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2016-07-05 11:25:11 +01:00
Chris Wilson
b1379d4964 drm/i915: Replace lockless_dereference(bool) with READ_ONCE()
After Joonas complained about using READ_ONCE() on the only use of the
variable in the function, where the intent was to simply document that
the read was intentionally racy and unlocked, I switched the READ_ONCE()
over to lockless_dereference(). However, in linux-next that has a
stronger type-check to only allow pointers and is no longer
interchangeable with READ_ONCE(), see commit 331b6d8c7a
("locking/barriers: Validate lockless_dereference() is used on a pointer
type")

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: 67d97da349 ("drm/i915: Only start retire worker when idle")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1467705276-707-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2016-07-05 10:58:07 +01:00
Chris Wilson
fac5e23e3c drm/i915: Mass convert dev->dev_private to to_i915(dev)
Since we now subclass struct drm_device, we can save pointer dances by
noting the equivalence of struct drm_device and struct drm_i915_private,
i.e. by using to_i915().

   text    data     bss     dec     hex filename
1073824    4562     416 1078802  107612 drivers/gpu/drm/i915/i915.ko
1068976    4562     416 1073954  106322 drivers/gpu/drm/i915/i915.ko

Created by the coccinelle script:

@@
expression E;
identifier p;
@@
- struct drm_i915_private *p = E->dev_private;
+ struct drm_i915_private *p = to_i915(E);

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467628477-25379-1-git-send-email-chris@chris-wilson.co.uk
2016-07-04 12:54:07 +01:00
Chris Wilson
c33d247d0e drm/i915: Flush the RPS bottom-half when the GPU idles
Make sure that the RPS bottom-half is flushed before we set the idle
frequency when we decide the GPU is idle. This should prevent any races
with the bottom-half and setting the idle frequency, and ensures that
the bottom-half is bounded by the GPU's rpm reference taken for when it
is active (i.e. between gen6_rps_busy() and gen6_rps_idle()).

v2: Avoid recursively using the i915->wq - RPS does not touch the
struct_mutex so has no place being on the ordered i915->wq.
v3: Enable/disable interrupts for RPS busy/idle in order to prevent
further HW access from RPS outside of the wakeref.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
References: https://bugs.freedesktop.org/show_bug.cgi?id=89728
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-6-git-send-email-chris@chris-wilson.co.uk
2016-07-04 08:18:23 +01:00
Chris Wilson
67d97da349 drm/i915: Only start retire worker when idle
The retire worker is a low frequency task that makes sure we retire
outstanding requests if userspace is being lax. We only need to start it
once as it remains active until the GPU is idle, so do a cheap test
before the more expensive queue_work(). A consequence of this is that we
need correct locking in the worker to make the hot path of request
submission cheap. To keep the symmetry and keep hangcheck strictly bound
by the GPU's wakelock, we move the cancel_sync(hangcheck) to the idle
worker before dropping the wakelock.

v2: Guard against RCU fouling the breadcrumbs bottom-half whilst we kick
the waiter.
v3: Remove the wakeref assertion squelching (now we hold a wakeref for
the hangcheck, any rpm error there is genuine).
v4: To prevent excess work when retiring requests, we split the busy
flag into two, a boolean to denote whether we hold the wakeref and a
bitmask of active engines.
v5: Reorder cancelling hangcheck upon idling to avoid a race where we
might cancel a hangcheck after being preempted by a new task

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
References: https://bugs.freedesktop.org/show_bug.cgi?id=88437
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-1-git-send-email-chris@chris-wilson.co.uk
2016-07-04 08:18:19 +01:00
Chris Wilson
c5a7b5aace drm/i915: Remove debug noise on detecting fault-injection of missed interrupts
Since the tests can and do explicitly check debugfs/i915_ring_missed_irqs
for the handling of a "missed interrupt", adding it to the dmesg at INFO
is just noise. When it happens for real, we still class it as an ERROR.

Note that I have chose to remove it entirely because when we detect the
"missed interrupt" is irrelevant and the message contains no more
information than we glean from looking in debugfs.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-20-git-send-email-chris@chris-wilson.co.uk
2016-07-01 21:04:17 +01:00
Chris Wilson
31bb59cc01 drm/i915: Move the get/put irq locking into the caller
With only a single callsite for intel_engine_cs->irq_get and ->irq_put,
we can reduce the code size by moving the common preamble into the
caller, and we can also eliminate the reference counting.

For completeness, as we are no longer doing reference counting on irq,
rename the get/put vfunctions to enable/disable respectively and are
able to review the use of posting reads. We only require the
serialisation with hardware when enabling the interrupt (i.e. so we
cannot miss an interrupt by going to sleep before the hardware truly
enables it).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-18-git-send-email-chris@chris-wilson.co.uk
2016-07-01 21:04:16 +01:00
Chris Wilson
3d5564e910 drm/i915: Only apply one barrier after a breadcrumb interrupt is posted
If we flag the seqno as potentially stale upon receiving an interrupt,
we can use that information to reduce the frequency that we apply the
heavyweight coherent seqno read (i.e. if we wake up a chain of waiters).

v2: Use cmpxchg to replace READ_ONCE/WRITE_ONCE for more explicit
control of the ordering wrt to interrupt generation and interrupt
checking in the bottom-half.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-14-git-send-email-chris@chris-wilson.co.uk
2016-07-01 21:00:54 +01:00
Chris Wilson
f8973c217f drm/i915: Add a delay between interrupt and inspecting the final seqno (ilk)
On Ironlake, there is no command nor register to ensure that the write
from a MI_STORE command is completed (and coherent on the CPU) before the
command parser continues. This means that the ordering between the seqno
write and the subsequent user interrupt is undefined (like gen6+). So to
ensure that the seqno write is completed after the final user interrupt
we need to delay the read sufficiently to allow the write to complete.
This delay is undefined by the bspec, and empirically requires 75us even
though a register read combined with a clflush is less than 500ns. Hence,
the delay is due to an on-chip buffer rather than the latency of the write
to memory.

Note that the render ring controls this by filling the PIPE_CONTROL fifo
with stalling commands that force the earliest pipe-control with the
seqno to be completed before the command parser continues. Given that we
need a barrier operation for BSD, we may as well forgo the extra
per-batch latency by using a common per-interrupt barrier.

Studying the impact of adding the usleep shows that in both sequences of
and individual synchronous no-op batches is negligible for the media
engine (where the write now is unordered with the interrupt). Converting
the render engine over from the current glutton of pie-controls over to
the per-interrupt delays speeds up both the sequential and individual
synchronous no-ops by 20% and 60%, respectively. This speed up holds
even when looking at the throughput of small copies (4KiB->4MiB), both
serial and synchronous, by about 20%. This is because despite adding a
significant delay to the interrupt, in all likelihood we will see the
seqno write without having to apply the barrier (only in the rare corner
cases where the write is delayed on the last required is the delay
necessary).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94307
Testcase: igt/gem_sync #ilk
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-12-git-send-email-chris@chris-wilson.co.uk
2016-07-01 21:00:53 +01:00
Chris Wilson
1b7744e7ba drm/i915: Use HWS for seqno tracking everywhere
By using the same address for storing the HWS on every platform, we can
remove the platform specific vfuncs and reduce the get-seqno routine to
a single read of a cached memory location.

v2: Fix semaphore_passed() to look at the signaling engine (not the
waiter's)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-8-git-send-email-chris@chris-wilson.co.uk
2016-07-01 20:58:48 +01:00
Chris Wilson
688e6c7258 drm/i915: Slaughter the thundering i915_wait_request herd
One particularly stressful scenario consists of many independent tasks
all competing for GPU time and waiting upon the results (e.g. realtime
transcoding of many, many streams). One bottleneck in particular is that
each client waits on its own results, but every client is woken up after
every batchbuffer - hence the thunder of hooves as then every client must
do its heavyweight dance to read a coherent seqno to see if it is the
lucky one.

Ideally, we only want one client to wake up after the interrupt and
check its request for completion. Since the requests must retire in
order, we can select the first client on the oldest request to be woken.
Once that client has completed his wait, we can then wake up the
next client and so on. However, all clients then incur latency as every
process in the chain may be delayed for scheduling - this may also then
cause some priority inversion. To reduce the latency, when a client
is added or removed from the list, we scan the tree for completed
seqno and wake up all the completed waiters in parallel.

Using igt/benchmarks/gem_latency, we can demonstrate this effect. The
benchmark measures the number of GPU cycles between completion of a
batch and the client waking up from a call to wait-ioctl. With many
concurrent waiters, with each on a different request, we observe that
the wakeup latency before the patch scales nearly linearly with the
number of waiters (before external factors kick in making the scaling much
worse). After applying the patch, we can see that only the single waiter
for the request is being woken up, providing a constant wakeup latency
for every operation. However, the situation is not quite as rosy for
many waiters on the same request, though to the best of my knowledge this
is much less likely in practice. Here, we can observe that the
concurrent waiters incur extra latency from being woken up by the
solitary bottom-half, rather than directly by the interrupt. This
appears to be scheduler induced (having discounted adverse effects from
having a rbtree walk/erase in the wakeup path), each additional
wake_up_process() costs approximately 1us on big core. Another effect of
performing the secondary wakeups from the first bottom-half is the
incurred delay this imposes on high priority threads - rather than
immediately returning to userspace and leaving the interrupt handler to
wake the others.

To offset the delay incurred with additional waiters on a request, we
could use a hybrid scheme that did a quick read in the interrupt handler
and dequeued all the completed waiters (incurring the overhead in the
interrupt handler, not the best plan either as we then incur GPU
submission latency) but we would still have to wake up the bottom-half
every time to do the heavyweight slow read. Or we could only kick the
waiters on the seqno with the same priority as the current task (i.e. in
the realtime waiter scenario, only it is woken up immediately by the
interrupt and simply queues the next waiter before returning to userspace,
minimising its delay at the expense of the chain, and also reducing
contention on its scheduler runqueue). This is effective at avoid long
pauses in the interrupt handler and at avoiding the extra latency in
realtime/high-priority waiters.

v2: Convert from a kworker per engine into a dedicated kthread for the
bottom-half.
v3: Rename request members and tweak comments.
v4: Use a per-engine spinlock in the breadcrumbs bottom-half.
v5: Fix race in locklessly checking waiter status and kicking the task on
adding a new waiter.
v6: Fix deciding when to force the timer to hide missing interrupts.
v7: Move the bottom-half from the kthread to the first client process.
v8: Reword a few comments
v9: Break the busy loop when the interrupt is unmasked or has fired.
v10: Comments, unnecessary churn, better debugging from Tvrtko
v11: Wake all completed waiters on removing the current bottom-half to
reduce the latency of waking up a herd of clients all waiting on the
same request.
v12: Rearrange missed-interrupt fault injection so that it works with
igt/drv_missed_irq_hang
v13: Rename intel_breadcrumb and friends to intel_wait in preparation
for signal handling.
v14: RCU commentary, assert_spin_locked
v15: Hide BUG_ON behind the compiler; report on gem_latency findings.
v16: Sort seqno-groups by priority so that first-waiter has the highest
task priority (and so avoid priority inversion).
v17: Add waiters to post-mortem GPU hang state.
v18: Return early for a completed wait after acquiring the spinlock.
Avoids adding ourselves to the tree if the is already complete, and
skips the awkward question of why we don't do completion wakeups for
waits earlier than or equal to ourselves.
v19: Prepare for init_breadcrumbs to fail. Later patches may want to
allocate during init, so be prepared to propagate back the error code.

Testcase: igt/gem_concurrent_blit
Testcase: igt/benchmarks/gem_latency
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com>
Cc: "Gong, Zhipeng" <zhipeng.gong@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
Cc: "Goel, Akash" <akash.goel@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> #v18
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-6-git-send-email-chris@chris-wilson.co.uk
2016-07-01 20:58:43 +01:00
Chris Wilson
1f15b76f1e drm/i915: Separate GPU hang waitqueue from advance
Currently __i915_wait_request uses a per-engine wait_queue_t for the dual
purpose of waking after the GPU advances or for waking after an error.
In the future, we may add even more wake sources and require greater
separation, but for now we can conceptually simplify wakeups by separating
the two sources. In particular, this allows us to use different wait-queues
(e.g. one on the engine advancement, a global one for errors and one on
each requests) without any hassle.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-5-git-send-email-chris@chris-wilson.co.uk
2016-07-01 20:42:31 +01:00
Chris Wilson
26a02b8fc3 drm/i915: Make queueing the hangcheck work inline
Since the function is a small wrapper around schedule_delayed_work(),
move it inline to remove the function call overhead for the principle
caller.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-4-git-send-email-chris@chris-wilson.co.uk
2016-07-01 20:42:30 +01:00
Chris Wilson
7774002586 drm/i915: Remove the dedicated hangcheck workqueue
The queue only ever contains at most one item and has no special flags.
It is just a very simple wrapper around the system-wq - a complication
with no benefits.

v2: Use the system_long_wq as we may wish to capture the error state
after detecting the hang - which may take a bit of time.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-3-git-send-email-chris@chris-wilson.co.uk
2016-07-01 20:42:29 +01:00
Chris Wilson
05535726d3 drm/i915: Delay queuing hangcheck to wait-request
We can forgo queuing the hangcheck from the start of every request to
until we wait upon a request. This reduces the overhead of every
request, but may increase the latency of detecting a hang. However, if
nothing every waits upon a hang, did it ever hang? It also improves the
robustness of the wait-request by ensuring that the hangchecker is
indeed running before we sleep indefinitely (and thereby ensuring that
we never actually sleep forever waiting for a dead GPU).

As pointed out by Tvrtko, it is possible for a GPU hang to go unnoticed
for as long as nobody is waiting for the GPU. Though this rare, during
that time we may be consuming more power than if we had promptly
recovered, and in the most extreme case we may exhaust all memory before
forcing the hangcheck. Something to be wary off in future.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-2-git-send-email-chris@chris-wilson.co.uk
2016-07-01 20:42:28 +01:00
Tvrtko Ursulin
14bb2c1179 drm/i915: Fix a buch of kerneldoc warnings
Just a bunch of stale kerneldocs generating warnings when
building the docs. Mostly function parameters so not very
useful but still.

v2: Tidy.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1464958937-23344-1-git-send-email-tvrtko.ursulin@linux.intel.com
2016-06-06 13:04:26 +01:00
Sagar Arun Kamble
1800ad255c drm/i915: Update GEN6_PMINTRMSK setup with GuC enabled
On Loading, GuC sets PM interrupts routing (bit 31) and clears ARAT
expired interrupt (bit 9). Host turbo also updates this register
in RPS flows. This patch ensures bit 31 and bit 9 setup by GuC persists.
ARAT timer interrupt is needed in GuC for various features. It also
facilitates halting GuC and hence achieving RC6. PM interrupt routing
will not impact RPS interrupt reception by host as GuC will redirect
them.
This patch fixes igt test pm_rc6_residency that was failing with guc
load/submission enabled. Tested with SKL GuC v6.1 and BXT GuC v5.1 and v8.7.

v2: i915_irq/i915_pm decoupling from intel_guc. (ChrisW)

v3: restructuring the mask update and rebase w.r.t Ville's patch. (ChrisW)

v4: Updating the pm_intr_keep during direct_interrupts_to_guc. (Sagar)

Cc: Chris Harris <chris.harris@intel.com>
Cc: Zhe Wang <zhe1.wang@intel.com>
Cc: Deepak S <deepak.s@intel.com>
Cc: Satyanantha, Rama Gopal M <rama.gopal.m.satyanantha@intel.com>
Cc: Akash Goel <akash.goel@intel.com>
Testcase: igt/pm_rc6_residency
Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Tested-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1464683307-19475-1-git-send-email-sagar.a.kamble@intel.com
2016-05-31 16:13:47 -07:00
Daniel Vetter
5a21b6650a drm/i915: Revert async unpin and nonblocking atomic commit
This reverts the following patches:

d55dbd06bb drm/i915: Allow nonblocking update of pageflips.
15c86bdb76 drm/i915: Check for unpin correctness.
95c2ccdc82 Reapply "drm/i915: Avoid stalling on pending flips for legacy cursor updates"
a6747b7304 drm/i915: Make unpin async.
03f476e1fc drm/i915: Prepare connectors for nonblocking checks.
2099deffef drm/i915: Pass atomic states to fbc update functions.
ee7171af72 drm/i915: Remove reset_counter from intel_crtc.
2ee004f7c5 drm/i915: Remove queue_flip pointer.
b8d2afae55 drm/i915: Remove use_mmio_flip kernel parameter.
8dd634d922 drm/i915: Remove cs based page flip support.
143f73b3bf drm/i915: Rework intel_crtc_page_flip to be almost atomic, v3.
84fc494b64 drm/i915: Add the exclusive fence to plane_state.
6885843ae1 drm/i915: Convert flip_work to a list.
aa420ddd8e drm/i915: Allow mmio updates on all platforms, v2.
afee4d8707 Revert "drm/i915: Avoid stalling on pending flips for legacy cursor updates"

"drm/i915: Allow nonblocking update of pageflips" should have been
split up, misses a proper commit message and seems to cause issues in
the legacy page_flip path as demonstrated by kms_flip.

"drm/i915: Make unpin async" doesn't handle the unthrottled cursor
updates correctly, leading to an apparent pin count leak. This is
caught by the WARN_ON in i915_gem_object_do_pin which screams if we
have more than DRM_I915_GEM_OBJECT_MAX_PIN_COUNT pins.

Unfortuantely we can't just revert these two because this patch series
came with a built-in bisect breakage in the form of temporarily
removing the unthrottled cursor update hack for legacy cursor ioctl.
Therefore there's no other option than to revert the entire pile :(

There's one tiny conflict in intel_drv.h due to other patches, nothing
serious.

Normally I'd wait a bit longer with doing a maintainer revert, but
since the minimal set of patches we need to revert (due to the bisect
breakage) is so big, time is running out fast. And very soon
(especially after a few attempts at fixing issues) it'll be really
hard to revert things cleanly.

Lessons learned:
- Not a good idea to rush the review (done by someone fairly new to
  the area) and not make sure domain experts had a chance to read it.

- Patches should be properly split up. I only looked at the two
  patches that should be reverted in detail, but both look like the
  mix up different things in one patch.

- Patches really should have proper commit messages. Especially when
  doing more than one thing, and especially when touching critical and
  tricky core code.

- Building a patch series and r-b stamping it when it has a built-in
  bisect breakage is not a good idea.

- I also think we need to stop building up technical debt by
  postponing atomic igt testcases even longer. I think it's clear that
  there's enough corner cases in this beast that we really need to
  have the testcases _before_ the next step lands.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2016-05-25 09:33:04 +02:00
Ville Syrjälä
11825b0dba drm/i915: Enable GSE interrupt on BDW+
We've never actually enabled or unmasked the GSE interrupt on BDW+,
even though the interrupt handler was always prepared for it.
Let's enable it and see what happens.

Credit to Mark Kettenis who fixed this in the OpenBSD fork of the
driver. He reports that it fixed the "ACPI _BCM/_BCQ-based
brightness mechanism on a MacBookPro12,1 and a 3rd gen Lenovo X1
Carbon" for them.

Mark says:
"FWIW, this *is* needed if you want ACPI-based backlight control to
 work.  On Linux you probably don't notice, since "hardware" backlight
 control is preferred over "firmware" or "platform" backlight control.

 It would help me if this did land in the Linux tree though, as it will
 make future imports of the i915 driver into OpenBSD easier."

So even though we don't really need this, let's put it in to help Mark
with future porting efforts. Should be harmless to have it enabled in
any case.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
References: http://lists.freedesktop.org/archives/intel-gfx/2015-December/081799.html
Reported-by: Mark Kettenis <mark.kettenis@xs4all.nl>
Cc: Mark Kettenis <mark.kettenis@xs4all.nl>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463649283-28698-1-git-send-email-ville.syrjala@linux.intel.com
Acked-by: Jani Nikula <jani.nikula@intel.com>
2016-05-23 17:53:52 +03:00
Maarten Lankhorst
8dd634d922 drm/i915: Remove cs based page flip support.
With mmio flips now available on all platforms it's time to remove
support for cs flips.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463490484-19540-13-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
2016-05-19 14:38:15 +02:00
Maarten Lankhorst
51cbaf010f drm/i915: Unify unpin_work and mmio_work into flip_work, v2.
Rename intel_unpin_work to intel_flip_work and use it for mmio flips
and unpinning. Use flip_queued_req to hold the wait request in the
mmio case, and the vblank counter from intel_crtc_get_vblank_counter.

MMIO flips get their own path through intel_finish_page_flip_mmio,
handled on vblank. CS page flips go through *_cs.

Changes since v1:
- Clean up destinction between MMIO and CS flips.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463490484-19540-7-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
2016-05-19 14:37:17 +02:00
Maarten Lankhorst
5251f04e0c drm/i915: Remove intel_prepare_page_flip, v3.
Instead of calling prepare_flip right before calling finish_page_flip
do everything from prepare_page_flip in finish_page_flip.

Putting prepare and finish page_flip in a single step removes the need
for INTEL_FLIP_COMPLETE, so it can be removed. This simplifies the code
slightly.

Changes since v1:
- Invert if case to simplify code.
- Add missing barrier.
- Reword commit message.
Changes since v2:
- intel_page_flip_plane is removed.
- work->pending is turned into a bool.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463490484-19540-5-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
2016-05-19 14:36:59 +02:00
Maarten Lankhorst
ef58319d3f drm/i915: Remove intel_finish_page_flip_plane.
This function is duplicated with intel_finish_page_flip,
and is only ever used from planes that could use the
other function anyway.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463490484-19540-4-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
2016-05-19 14:36:48 +02:00
Tvrtko Ursulin
7e22dbbbae drm/i915: Replace "INTEL_INFO->gen == x" checks with IS_GENx
This way optimization from a previous patch works even better.

v2: Rebase.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
2016-05-11 12:27:27 +01:00
Chris Wilson
dc97997a21 drm/i915: Use drm_i915_private as the native pointer for intel_uncore.c
Pass drm_i915_private to the uncore init/fini routines and their
subservients as it is their native type.

   text    data     bss     dec     hex filename
6309978 3578778  696320 10585076         a183f4 vmlinux
6309530 3578778  696320 10584628         a18234 vmlinux

a modest 400 bytes of saving, but 60 lines of code deleted!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1462885804-26750-1-git-send-email-chris@chris-wilson.co.uk
2016-05-10 17:20:20 +01:00
Chris Wilson
c033666a94 drm/i915: Store a i915 backpointer from engine, and use it
text	   data	    bss	    dec	    hex	filename
6309351	3578714	 696320	10584385	 a18141	vmlinux
6308391	3578714	 696320	10583425	 a17d81	vmlinux

Almost 1KiB of code reduction.

v2: More s/INTEL_INFO()->gen/INTEL_GEN()/ and IS_GENx() conversions

   text	   data	    bss	    dec	    hex	filename
6304579	3578778	 696320	10579677	 a16edd	vmlinux
6303427	3578778	 696320	10578525	 a16a5d	vmlinux

Now over 1KiB!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1462545621-30125-3-git-send-email-chris@chris-wilson.co.uk
2016-05-09 13:41:24 +01:00
Tvrtko Ursulin
91d14251bb drm/i915: Small display interrupt handlers tidy
I have noticed some of our interrupt handlers use both dev and
dev_priv while they could get away with only dev_priv in the
huge majority of cases.

Tidying that up had a cascading effect on changing functions
prototypes, so relatively big churn factor, but I think it is
for the better.

For example even where changes cascade out of i915_irq.c, for
functions prefixed with intel_, genX_ or <plat>_, it makes more
sense to take dev_priv directly anyway.

This allows us to eliminate local variables and intermixed usage
of dev and dev_priv where only one is good enough.

End result is shrinkage of both source and the resulting binary.

i915.ko:

 - .text         000b0899
 + .text         000b0619

Or if we look at the Gen8 display irq chain:

 -00000000000006ad t gen8_irq_handler
 +0000000000000663 t gen8_irq_handler
   -0000000000000028 T intel_opregion_asle_intr
   +0000000000000024 T intel_opregion_asle_intr
   -000000000000008c t ilk_hpd_irq_handler
   +000000000000007f t ilk_hpd_irq_handler
   -0000000000000116 T intel_check_page_flip
   +0000000000000112 T intel_check_page_flip
   -000000000000011a T intel_prepare_page_flip
   +0000000000000119 T intel_prepare_page_flip
   -0000000000000014 T intel_finish_page_flip_plane
   +0000000000000013 T intel_finish_page_flip_plane
   -0000000000000053 t hsw_pipe_crc_irq_handler
   +000000000000004c t hsw_pipe_crc_irq_handler
   -000000000000022e t cpt_irq_handler
   +0000000000000213 t cpt_irq_handler

So small shrinkage but it is all fast paths so doesn't harm.

Situation is similar in other interrupt handlers as well.

v2: Tidy intel_queue_rps_boost_for_request as well. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-05-09 13:38:16 +01:00
Tvrtko Ursulin
98735739cf drm/i915/gen8+: Do not enable DPF interrupt since the handler does not exist
Looks like DPF was not implemented for gen8+ but the IER and IMR
are still enabled on initialization.

Since there is no code to handle this interrupt, gate the irq
enablement behind HAS_L3_DPF in case the feature gets enabled
in the future.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2016-04-20 09:59:16 +01:00
Ville Syrjälä
e30e251acc drm/i915: Split gen8_gt_irq_handler into ack+handle
As we did on VLV, split the gt irq handling to ack and handler phases on
CHV. Leave the BDW+ codepath mostly intact for now.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460571598-24452-13-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-14 14:46:00 +03:00
Ville Syrjälä
261e40b8b7 drm/i915: Eliminate passing dev+dev_priv to {snb,ilk}_gt_irq_handler()
It looks silly to pass both dev and dev_priv to the snb/ilk gt irq
handlers. Just pass dev_priv.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460571598-24452-12-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-14 14:45:49 +03:00
Ville Syrjälä
528948745f drm/i915: Move gt/pm irq handling out from irq disabled section on VLV
No need to actually handle the GT/PM interrupt while we have interrupt
sources disabled. Move the actual processing to happen after we've
restored VLV_IER and master interrupt enable.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460571598-24452-11-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-14 14:45:45 +03:00
Ville Syrjälä
2ecb8ca4a0 drm/i915: Split VLV/CVH PIPESTAT handling into ack+handler
Minimize the amount of stuff we do with interrupt sources disabled by
splitting the PIPESTAT irq handling into ack+handler phases.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460571598-24452-10-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-14 14:45:42 +03:00
Ville Syrjälä
1ae3c34c09 drm/i915: Split PORT_HOTPLUG_STAT ack out from i9xx_hpd_irq_handler()
Split the VLV/CHV hoplug irq handling to ack and handler phases. This
way we can move the actual irq handling outside the section where
we have disabled the interrupt sources.

For now, we leave things as is for pre-VLV GMCH platforms, but
eventually they could get the same treatment.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460571598-24452-9-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-14 14:45:37 +03:00
Ville Syrjälä
6e814800a2 drm/i915: Move variables to narrower scope in VLV/CHV irq handlers
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460571598-24452-8-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-14 14:45:33 +03:00
Ville Syrjälä
1e1cace942 drm/i915: Eliminate loop from VLV irq handler
Now that we've dealt with the races in clearing IIR bits via VLV_IER
and the master interrupt enable, we can go ahead aliminate the loop
from the VLV interrupt handler.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460571598-24452-7-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-14 14:45:27 +03:00
Ville Syrjälä
a5e485a95c drm/i915: Clear VLV_IER around irq processing
On VLV/CHV the master interrupt enable bit only affects GT/PM
interrupts. Display interrupts are not affected by the master
irq control.

Also it seems that the CPU interrupt will only be generated when
the combined result of all GT/PM/display interrupts has a 0->1
edge. We already use the master interrupt enable bit to make sure
GT/PM interrupt can generate such an edge if we don't end up clearing
all IIR bits. We must do the same for display interrupts, and for
that we can simply clear out VLV_IER, and restore after we've acked
all the interrupts we are about to process.

So with both master interrupt enable and VLV_IER cleared out, we will
guarantee that there will be a 0->1 edge if any IIR bits are still set
at the end, and thus another CPU interrupt will be generated.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 579de73b04 ("drm/i915: Exit cherryview_irq_handler() after one pass")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460571598-24452-6-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-14 14:45:22 +03:00
Ville Syrjälä
4a0a0202b0 drm/i915: Clear VLV_MASTER_IER around irq processing
Like on CHV, let's clear out the master irq enable bit when we ack
GT/PM interrupts. This will allow GT/PM interrupts to re-raise the
CPU interrupt if we fail to clear all the bits from the IIR(s).

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460571598-24452-5-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-14 14:45:19 +03:00
Ville Syrjälä
7ce4d1f273 drm/i915: Clear VLV_IIR after PIPESTAT
On VLV/CHV VLV_IIR is not double double buffered, and it doesn't detect
edges from PIPESTAT & co. like it does on gen4. Instead it just
directly latches the level from PIPESTAT & co. That means we must clear
VLV_IIR after PIPESTAT & co. or else we'll get a spurious bit in VLV_IIR
every single time.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460571598-24452-4-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-14 14:45:15 +03:00
Ville Syrjälä
34c7b8a7b8 drm/i915: Set up VLV_MASTER_IER consistently
We're lacking VLV_MASTER_IER setup from valleyview_irq_preinstall(), so
add it there. Also cargo cult in some POSTING_READ()s to match the other
platforms.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460571598-24452-3-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-14 14:45:11 +03:00
Ville Syrjälä
e5328c43d4 drm/i915: Use GEN8_MASTER_IRQ_CONTROL consistently
Use GEN8_MASTER_IRQ_CONTROL instead of DE_MASTER_IRQ_CONTROL or
MASTER_INTERRUPT_ENABLE with the GEN8_MASTER_IRQ register. They're
all bit 31 so there's no actual bug here, but let's be consistent
which name we use for the bit.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460571598-24452-2-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-14 14:45:05 +03:00
Chris Wilson
d98c52cf4f drm/i915: Tighten reset_counter for reset status
In the reset_counter, we use two bits to track a GPU hang and reset. The
low bit is a "reset-in-progress" flag that we set to signal when we need
to break waiters in order for the recovery task to grab the mutex. As
soon as the recovery task has the mutex, we can clear that flag (which
we do by incrementing the reset_counter thereby incrementing the gobal
reset epoch). By clearing that flag when the recovery task holds the
struct_mutex, we can forgo a second flag that simply tells GEM to ignore
the "reset-in-progress" flag.

The second flag we store in the reset_counter is whether the
reset failed and we consider the GPU terminally wedged. Whilst this flag
is set, all access to the GPU (at least through GEM rather than direct mmio
access) is verboten.

PS: Fun is in store, as in the future we want to move from a global
reset epoch to a per-engine reset engine with request recovery.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-6-git-send-email-chris@chris-wilson.co.uk
2016-04-14 10:45:40 +01:00
Chris Wilson
c19ae989b0 drm/i915: Hide the atomic_read(reset_counter) behind a helper
This is principally a little bit of syntatic sugar to hide the
atomic_read()s throughout the code to retrieve the current reset_counter.
It also provides the other utility functions to check the reset state on the
already read reset_counter, so that (in later patches) we can read it once
and do multiple tests rather than risk the value changing between tests.

v2: Be more strict on converting existing i915_reset_in_progress() over to
the more verbose i915_reset_in_progress_or_wedged().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1460565315-7748-4-git-send-email-chris@chris-wilson.co.uk
2016-04-14 10:45:40 +01:00
Ville Syrjälä
71b8b41d5b drm/i915: Move DPINVGTT setup to vlv_display_irq_reset()
DPINVGTT lives inside the disp2d power well so we can't frob it unless
we know the power well is active. Let's this stuff into
vlv_display_irq_reset() which is only called at the right times so that
we don't get unclaimed register access errors.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94164
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460382992-28728-10-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
2016-04-12 19:09:21 +03:00
Ville Syrjälä
6b7eafc1b4 drm/i915: Warn if irq_mask isn't ~0 during vlv/cvh display irq postinstall
We expect vlv_display_irq_reset() to have been called prior to
vlv_display_irq_postinstall() so let's WARN if that isn't the case.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460382992-28728-8-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
2016-04-12 19:08:51 +03:00
Ville Syrjälä
9ab981f22b drm/i915: Use GEN5_IRQ_INIT() in vlv_display_irq_postinstall()
Replace the hand rolled IMR/IER setup in vlv_display_irq_postinstall()
with GEN5_IRQ_INIT(). Also rename the iir_mask to enable_mask to avoid
consusion since we no longer deal with IIR here.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460382992-28728-7-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
2016-04-12 19:08:32 +03:00
Ville Syrjälä
d6c6980358 drm/i915: Clear display interrupt before enabling when turning on the power well
For a bit of extra paranoia make sure the display irqs are all cleared
before we enabled them when turning on the power well. This should
really be the case already since the power well was off which resets
everything.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460382992-28728-6-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
2016-04-12 19:07:38 +03:00
Ville Syrjälä
8bb613068a drm/i915: Move vlv/chv display irq code to a more logical place
Reshuffle the code a bit to move the vlv/chv display irq functions away
from the main irq hooks, next to the other sub (de,gt,etc.) hooks.

v2: Rebased due to changes in vlv_display_irq_reset()

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com> (v1)
Link: http://patchwork.freedesktop.org/patch/msgid/1460476604-2035-1-git-send-email-ville.syrjala@linux.intel.com
2016-04-12 19:07:24 +03:00
Ville Syrjälä
9918271efc drm/i915: Skip display irq setup if display irqs aren't flagged as enabled
During runtime PM we'll be reinitializing interrupt support from the
ground up. However since the display power well will be off at that
time, well end up with a ton of unclaimed register accesses from the
display irq setup. Since we turned off the power well already before
runtime suspend, we've flagged display irqs as disabled during runtime
PM transitions. So we can just check that flag to see if we should do
skip display irqs during irq setup.

During driver load display irqs will be flagged as enabled since we've
turned on the power well already, however the power well code will have
skipped the display irq setup since irq support as a whole wasn't yet
enabled when the power well was enabled. So we'll want to do the display
irq setup in that case.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94164
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460382992-28728-4-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
2016-04-12 19:07:13 +03:00
Ville Syrjälä
ad22d10654 drm/i915: Fix up vlv/chv display irq setup
The vlv/chv display irq setup was a bit of mess after I ran out of steam
when working on it last. Fix it up so that we just have a _reset() and
_postinstall() hooks for the display irqs, and use those consistently.

v2: Clear out pipestat_irq_mask[] and PIPE_FIFO_UNDERRUN_STATUS in
    vlv_display_irq_reset() (Imre)

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com> (v1)
Link: http://patchwork.freedesktop.org/patch/msgid/1460476574-1921-1-git-send-email-ville.syrjala@linux.intel.com
2016-04-12 19:06:59 +03:00
Ville Syrjälä
93de68f940 drm/i915: Remove "VLV magic" from irq setup
No clue what this is supposed to achieve. I think it's been there since
the very beginning, so presumably some kind of kludge for very early
silicon. Let's just throw it out.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460382992-28728-2-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
2016-04-12 19:06:52 +03:00
Chris Wilson
12471ba87a drm/i915: Harden detection of missed interrupts
Only declare a missed interrupt if we find that the GPU is idle with
waiters and a hangcheck interval has passed in which no new user
interrupts have been raised.

v2: Clear the stuck interrupt marker between successful batches

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460195877-20520-3-git-send-email-chris@chris-wilson.co.uk
2016-04-09 12:09:29 +01:00
Chris Wilson
c04e0f3b4e drm/i915: Separate out the seqno-barrier from engine->get_seqno
In order to simplify future patches, extract the
lazy_coherency optimisation our of the engine->get_seqno() vfunc into
its own callback.

v2: Rename the barrier to engine->irq_seqno_barrier to try and better
reflect that the barrier is only required after the user interrupt before
reading the seqno (to ensure that the seqno update lands in time as we
do not have strict seqno-irq ordering on all platforms).

Reviewed-by: Dave Gordon <david.s.gordon@intel.com> [#v2]

v3: Comments for hangcheck paranoia. Mika wanted to keep the extra
barrier inside the hangcheck, just in case. I can argue that it doesn't
provide a barrier against anything, but the side-effects of applying the
barrier may prevent a false declaration of a hung GPU.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460195877-20520-2-git-send-email-chris@chris-wilson.co.uk
2016-04-09 12:09:05 +01:00
Chris Wilson
cffa781e59 drm/i915: Simplify check for idleness in hangcheck
Having fixed the tracking of the engine's last_submitted_seqno, we can
now rely on it for detecting when the engine is idle (and not have to
touch the requests pointer).

Testcase: igt/gem_exec_whisper
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460010558-10705-9-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2016-04-08 11:45:05 +01:00
Chris Wilson
7c90b7de73 drm/i915: Apply a mb between emitting the request and hangcheck
Seal the request and mark it as pending execution before we submit it to
hardware. We assume that the actual submission cannot fail (that
guarantee is provided by preallocating space in the request for the
submission). As we may inspect this state without holding any locks
during hangcheck we should apply a barrier to ensure that we do
not see a more recent value in the HWS than we are tracking.

Based on a patch by Mika Kuoppala.

Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460010558-10705-8-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2016-04-08 11:44:48 +01:00
Joonas Lahtinen
2d1fe07340 drm/i915: Do not use {HAS_*, IS_*, INTEL_INFO}(dev_priv->dev)
dev_priv is what the macro works hard to extract, pass it directly.

> sed 's/\([A-Z].*(dev_priv\)->dev)/\1)/g'

v2:
- Include all wrapper macros too (Chris)

v3:
- Include sed cmdline (Chris)

v4:
- Break long line
- Rebase

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460016485-8089-1-git-send-email-joonas.lahtinen@linux.intel.com
2016-04-07 14:50:26 +03:00
Shubhangi Shrivastava
d252bf68b7 drm/i915: Set invert bit for hpd based on VBT
This patch sets the invert bit for hpd detection for each port
based on VBT configuration. Since each AOB can be designed to
depend on invert bit or not, it is expected if an AOB requires
invert bit, the user will set respective bit in VBT.

v2: Separated VBT parsing from the rest of the logic. (Jani)

v3: Moved setting invert bit logic to bxt_hpd_irq_setup()
    and changed its logic to avoid looping twice. (Ville)

v4: Changed the logic to mask out the bits first and then
    set them to remove need of temporary variable. (Ville)

v5: Moved defines to existing set of defines for the register
    and added required breaks. (Ville)

Signed-off-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Shubhangi Shrivastava <shubhangi.shrivastava@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
[Jani: fixed some checkpatch noise, added kernel-doc.]
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1459420907-11383-2-git-send-email-shubhangi.shrivastava@intel.com
2016-04-06 14:22:48 +03:00
Tvrtko Ursulin
27af5eea54 drm/i915: Move execlists irq handler to a bottom half
Doing a lot of work in the interrupt handler introduces huge
latencies to the system as a whole.

Most dramatic effect can be seen by running an all engine
stress test like igt/gem_exec_nop/all where, when the kernel
config is lean enough, the whole system can be brought into
multi-second periods of complete non-interactivty. That can
look for example like this:

 NMI watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [kworker/u8:3:143]
 Modules linked in: [redacted for brevity]
 CPU: 0 PID: 143 Comm: kworker/u8:3 Tainted: G     U       L  4.5.0-160321+ #183
 Hardware name: Intel Corporation Broadwell Client platform/WhiteTip Mountain 1
 Workqueue: i915 gen6_pm_rps_work [i915]
 task: ffff8800aae88000 ti: ffff8800aae90000 task.ti: ffff8800aae90000
 RIP: 0010:[<ffffffff8104a3c2>]  [<ffffffff8104a3c2>] __do_softirq+0x72/0x1d0
 RSP: 0000:ffff88014f403f38  EFLAGS: 00000206
 RAX: ffff8800aae94000 RBX: 0000000000000000 RCX: 00000000000006e0
 RDX: 0000000000000020 RSI: 0000000004208060 RDI: 0000000000215d80
 RBP: ffff88014f403f80 R08: 0000000b1b42c180 R09: 0000000000000022
 R10: 0000000000000004 R11: 00000000ffffffff R12: 000000000000a030
 R13: 0000000000000082 R14: ffff8800aa4d0080 R15: 0000000000000082
 FS:  0000000000000000(0000) GS:ffff88014f400000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007fa53b90c000 CR3: 0000000001a0a000 CR4: 00000000001406f0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Stack:
  042080601b33869f ffff8800aae94000 00000000fffc2678 ffff88010000000a
  0000000000000000 000000000000a030 0000000000005302 ffff8800aa4d0080
  0000000000000206 ffff88014f403f90 ffffffff8104a716 ffff88014f403fa8
 Call Trace:
  <IRQ>
  [<ffffffff8104a716>] irq_exit+0x86/0x90
  [<ffffffff81031e7d>] smp_apic_timer_interrupt+0x3d/0x50
  [<ffffffff814f3eac>] apic_timer_interrupt+0x7c/0x90
  <EOI>
  [<ffffffffa01c5b40>] ? gen8_write64+0x1a0/0x1a0 [i915]
  [<ffffffff814f2b39>] ? _raw_spin_unlock_irqrestore+0x9/0x20
  [<ffffffffa01c5c44>] gen8_write32+0x104/0x1a0 [i915]
  [<ffffffff8132c6a2>] ? n_tty_receive_buf_common+0x372/0xae0
  [<ffffffffa017cc9e>] gen6_set_rps_thresholds+0x1be/0x330 [i915]
  [<ffffffffa017eaf0>] gen6_set_rps+0x70/0x200 [i915]
  [<ffffffffa0185375>] intel_set_rps+0x25/0x30 [i915]
  [<ffffffffa01768fd>] gen6_pm_rps_work+0x10d/0x2e0 [i915]
  [<ffffffff81063852>] ? finish_task_switch+0x72/0x1c0
  [<ffffffff8105ab29>] process_one_work+0x139/0x350
  [<ffffffff8105b186>] worker_thread+0x126/0x490
  [<ffffffff8105b060>] ? rescuer_thread+0x320/0x320
  [<ffffffff8105fa64>] kthread+0xc4/0xe0
  [<ffffffff8105f9a0>] ? kthread_create_on_node+0x170/0x170
  [<ffffffff814f351f>] ret_from_fork+0x3f/0x70
  [<ffffffff8105f9a0>] ? kthread_create_on_node+0x170/0x170

I could not explain, or find a code path, which would explain
a +20 second lockup, but from some instrumentation it was
apparent the interrupts off proportion of time was between
10-25% under heavy load which is quite bad.

When a interrupt "cliff" is reached, which was >~320k irq/s on
my machine, the whole system goes into a terrible state of the
above described multi-second lockups.

By moving the GT interrupt handling to a tasklet in a most
simple way, the problem above disappears completely.

Testing the effect on sytem-wide latencies using
igt/gem_syslatency shows the following before this patch:

gem_syslatency: cycles=1532739, latency mean=416531.829us max=2499237us
gem_syslatency: cycles=1839434, latency mean=1458099.157us max=4998944us
gem_syslatency: cycles=1432570, latency mean=2688.451us max=1201185us
gem_syslatency: cycles=1533543, latency mean=416520.499us max=2498886us

This shows that the unrelated process is experiencing huge
delays in its wake-up latency. After the patch the results
look like this:

gem_syslatency: cycles=808907, latency mean=53.133us max=1640us
gem_syslatency: cycles=862154, latency mean=62.778us max=2117us
gem_syslatency: cycles=856039, latency mean=58.079us max=2123us
gem_syslatency: cycles=841683, latency mean=56.914us max=1667us

Showing a huge improvement in the unrelated process wake-up
latency. It also shows an approximate halving in the number
of total empty batches submitted during the test. This may
not be worrying since the test puts the driver under
a very unrealistic load with ncpu threads doing empty batch
submission to all GPU engines each.

Another benefit compared to the hard-irq handling is that now
work on all engines can be dispatched in parallel since we can
have up to number of CPUs active tasklets. (While previously
a single hard-irq would serially dispatch on one engine after
another.)

More interesting scenario with regards to throughput is
"gem_latency -n 100" which  shows 25% better throughput and
CPU usage, and 14% better dispatch latencies.

I did not find any gains or regressions with Synmark2 or
GLbench under light testing. More benchmarking is certainly
required.

v2:
   * execlists_lock should be taken as spin_lock_bh when
     queuing work from userspace now. (Chris Wilson)
   * uncore.lock must be taken with spin_lock_irq when
     submitting requests since that now runs from either
     softirq or process context.

v3:
   * Expanded commit message with more testing data;
   * converted missed locking sites to _bh;
   * added execlist_lock comment. (Chris Wilson)

v4:
   * Mention dispatch parallelism in commit. (Chris Wilson)
   * Do not hold uncore.lock over MMIO reads since the block
     is already serialised per-engine via the tasklet itself.
     (Chris Wilson)
   * intel_lrc_irq_handler should be static. (Chris Wilson)
   * Cancel/sync the tasklet on GPU reset. (Chris Wilson)
   * Document and WARN that tasklet cannot be active/pending
     on engine cleanup. (Chris Wilson/Imre Deak)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Testcase: igt/gem_exec_nop/all
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94350
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1459768316-6670-1-git-send-email-tvrtko.ursulin@linux.intel.com
2016-04-04 14:08:52 +01:00
Chris Wilson
579de73b04 drm/i915: Exit cherryview_irq_handler() after one pass
This effectively reverts

commit 8e5fd599eb
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date:   Wed Apr 9 13:28:50 2014 +0300

    drm/i915/chv: Make CHV irq handler loop until all interrupts are consumed

as under continuous execlists load we can saturate the IRQ handler,
destablising the tsc clock and triggering the NMI watchdog to declare a hung
CPU.

[  552.756051] clocksource: timekeeping watchdog on CPU0: Marking clocksource 'tsc' as unstable because the skew is too large:
[  552.756080] clocksource:                       'refined-jiffies' wd_now: 10003b480 wd_last: 10003b28c mask: ffffffff
[  552.756091] clocksource:                       'tsc' cs_now: d55d31aa50 cs_last: d17446166c mask: ffffffffffffffff
[  552.756210] clocksource: Switched to clocksource refined-jiffies
[  575.217870] NMI watchdog: Watchdog detected hard LOCKUP on cpu 1
[  575.217893] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.5.0-rc7+ #18
[  575.217905] Hardware name:                  /NUC5CPYB, BIOS PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015
[  575.217915]  0000000000000000 ffff88027fd05bc0 ffffffff81288c6d 0000000000000000
[  575.217935]  0000000000000001 ffff88027fd05be0 ffffffff810e72d1 0000000000000000
[  575.217951]  ffff88027fd05c80 ffff88027fd05c20 ffffffff81114b60 0000000181015f1e
[  575.217967] Call Trace:
[  575.217973]  <NMI>  [<ffffffff81288c6d>] dump_stack+0x4f/0x72
[  575.217994]  [<ffffffff810e72d1>] watchdog_overflow_callback+0x151/0x160
[  575.218003]  [<ffffffff81114b60>] __perf_event_overflow+0xa0/0x1e0
[  575.218016]  [<ffffffff811154c4>] perf_event_overflow+0x14/0x20
[  575.218028]  [<ffffffff8101d2ca>] intel_pmu_handle_irq+0x1da/0x460
[  575.218042]  [<ffffffff814a8aae>] ? poll_idle+0x3e/0x70
[  575.218052]  [<ffffffff814a8aae>] ? poll_idle+0x3e/0x70
[  575.218064]  [<ffffffff81014ae8>] perf_event_nmi_handler+0x28/0x50
[  575.218075]  [<ffffffff81007540>] nmi_handle+0x60/0x130
[  575.218086]  [<ffffffff814a8aae>] ? poll_idle+0x3e/0x70
[  575.218096]  [<ffffffff810079c0>] do_nmi+0x140/0x470
[  575.218108]  [<ffffffff81559ec7>] end_repeat_nmi+0x1a/0x1e
[  575.218119]  [<ffffffff814a8aae>] ? poll_idle+0x3e/0x70
[  575.218129]  [<ffffffff814a8aae>] ? poll_idle+0x3e/0x70
[  575.218139]  [<ffffffff814a8aae>] ? poll_idle+0x3e/0x70
[  575.218148]  <<EOE>>  [<ffffffff814a8353>] cpuidle_enter_state+0xf3/0x2f0
[  575.218164]  [<ffffffff814a8587>] cpuidle_enter+0x17/0x20
[  575.218175]  [<ffffffff810aaa3a>] call_cpuidle+0x2a/0x40
[  575.218185]  [<ffffffff810aade3>] cpu_startup_entry+0x273/0x330
[  575.218196]  [<ffffffff81033a1e>] start_secondary+0x10e/0x130

However, not servicing all available IIR within the handler does hurt the
throughput of pathological nop execbuf by about 20%, with a similar effect
upon the dispatch latency of a series of execbuf.

v2: use do {} while(0) for a smaller patch, and easier to revert again

I have reasonable confidence that we do not miss GT interrupts (as
execlists provides a stress case with a failure mechanism easily
detected by igt), however I have less confidence about all the other
sources of interrupts and worry that may lose a display hotplug
interrupt, for example.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93467
Testcase: igt/gem_exec_nop/basic # requires NMI watchdog
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Antti Koskipää <antti.koskipaa@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1457946117-6714-1-git-send-email-chris@chris-wilson.co.uk
2016-03-30 14:24:01 +01:00
Dave Gordon
b4ac5afc6b drm/i915: replace for_each_engine()
Having provided for_each_engine_id() for cases where the third (id)
argument is useful, we can now replace all the remaining instances with
a simpler version that takes only two parameters. In many cases, this
also allows the elimination of the local variable used in the iterator
(usually 'i').

v2:
    s/dev_priv/(dev_priv__)/ in body of for_each_engine_masked() [Chris Wilson]

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458757194-17783-2-git-send-email-david.s.gordon@intel.com
2016-03-24 14:34:11 +00:00
Dave Gordon
c3232b1883 drm/i915: introduce for_each_engine_id()
Equivalent to the existing for_each_engine() macro, this will replace
the latter wherever the third argument *is* actually wanted (in most
places, it is not used). The third argument is renamed to emphasise
that it is an engine id (type enum intel_engine_id). All the callers of
the macro that actually need the third argument are updated to use this
version, and the argument (generally 'i') is also updated to be 'id'.
Other callers (where the third argument is unused) are untouched for
now; they will be updated in the next patch.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-03-24 14:34:06 +00:00
arun.siluvery@linux.intel.com
14b730fcb8 drm/i915/tdr: Prepare error handler to accept mask of hung engines
In preparation for engine reset, the wedged argument of i915_handle_error()
is extended to reflect as a mask of engines that are hung. This is further
passed down to error state capture functions which are also updated.

Engine reset recovery mechanism uses this mask and schedules recovery work
for those particular engines.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Tomas Elf <tomas.elf@intel.com>
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458331676-567-3-git-send-email-arun.siluvery@linux.intel.com
2016-03-22 14:12:59 +02:00
Imre Deak
bd39ec5dda drm/i915: Move load time IRQ SW init earlier
Most of the IRQ init is setting up hooks so move that part earlier.
Leave the pm_qos_add_request() call in place.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1458128348-15730-4-git-send-email-imre.deak@intel.com
2016-03-17 15:22:04 +02:00
Tvrtko Ursulin
117897f42c drm/i915: More renaming of rings to engines
This time using only sed and a few by hand.

v2: Rename also intel_ring_id and intel_ring_initialized.
v3: Fixed typo in intel_ring_initialized.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1458126040-33105-1-git-send-email-tvrtko.ursulin@linux.intel.com
2016-03-16 15:33:30 +00:00
Tvrtko Ursulin
666796da7a drm/i915: More intel_engine_cs renaming
Some trivial ones, first pass done with Coccinelle:

@@
@@
(
- I915_NUM_RINGS
+ I915_NUM_ENGINES
|
- intel_ring_flag
+ intel_engine_flag
|
- for_each_ring
+ for_each_engine
|
- i915_gem_request_get_ring
+ i915_gem_request_get_engine
|
- intel_ring_idle
+ intel_engine_idle
|
- i915_gem_reset_ring_status
+ i915_gem_reset_engine_status
|
- i915_gem_reset_ring_cleanup
+ i915_gem_reset_engine_cleanup
|
- init_ring_lists
+ init_engine_lists
)

But that didn't fully work so I cleaned it up with:

for f in *.[hc]; do sed -i -e s/I915_NUM_RINGS/I915_NUM_ENGINES/ $f; done
for f in *.[hc]; do sed -i -e s/i915_gem_request_get_ring/i915_gem_request_get_engine/ $f; done
for f in *.[hc]; do sed -i -e s/intel_ring_flag/intel_engine_flag/ $f; done
for f in *.[hc]; do sed -i -e s/intel_ring_idle/intel_engine_idle/ $f; done
for f in *.[hc]; do sed -i -e s/init_ring_lists/init_engine_lists/ $f; done
for f in *.[hc]; do sed -i -e s/i915_gem_reset_ring_cleanup/i915_gem_reset_engine_cleanup/ $f; done
for f in *.[hc]; do sed -i -e s/i915_gem_reset_ring_status/i915_gem_reset_engine_status/ $f; done

v2: Rebase.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-03-16 15:33:24 +00:00
Tvrtko Ursulin
4a570db57c drm/i915: Rename intel_engine_cs struct members
below and a couple manual fixups.

@@
identifier I, J;
@@
struct I {
...
- struct intel_engine_cs *J;
+ struct intel_engine_cs *engine;
...
}
@@
identifier I, J;
@@
struct I {
...
- struct intel_engine_cs J;
+ struct intel_engine_cs engine;
...
}
@@
struct drm_i915_private *d;
@@
(
- d->ring
+ d->engine
)
@@
struct i915_execbuffer_params *p;
@@
(
- p->ring
+ p->engine
)
@@
struct intel_ringbuffer *r;
@@
(
- r->ring
+ r->engine
)
@@
struct drm_i915_gem_request *req;
@@
(
- req->ring
+ req->engine
)

v2: Script missed the tracepoint code - fixed up by hand.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-03-16 15:33:17 +00:00
Tvrtko Ursulin
0bc40be85f drm/i915: Rename intel_engine_cs function parameters
@@
identifier func;
@@
func(..., struct intel_engine_cs *
- ring
+ engine
, ...)
{
<...
- ring
+ engine
...>
}
@@
identifier func;
type T;
@@
T func(..., struct intel_engine_cs *
- ring
+ engine
, ...);

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-03-16 15:33:10 +00:00
Tvrtko Ursulin
e2f8039147 drm/i915: Rename local struct intel_engine_cs variables
Done by the Coccinelle script below plus a manual
intervention to GEN8_RING_SEMAPHORE_INIT.

@@
expression E;
@@
- struct intel_engine_cs *ring = E;
+ struct intel_engine_cs *engine = E;
<+...
- ring
+ engine
...+>
@@
@@
- struct intel_engine_cs *ring;
+ struct intel_engine_cs *engine;
<+...
- ring
+ engine
...+>

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-03-16 15:33:00 +00:00
Mika Kuoppala
24a65e624b drm/i915/hangcheck: Prevent long walks across full-ppgtt
With full-ppgtt, it takes the GPU an eon to traverse the entire 256PiB
address space, causing a loop to be detected. Under the current scheme,
if ACTHD walks off the end of a batch buffer and into an empty
address space, we "never" detect the hang. If we always increment the
score as the ACTHD is progressing then we will eventually timeout (after
~46.5s (31 * 1.5s) without advancing onto a new batch). To counter act
this, increase the amount we reduce the score for good batches, so that
only a series of almost-bad batches trigger a full reset. DoS detection
suffers slightly but series of long running shader tests will benefit.

Based on a patch from Chris Wilson.

Testcase: igt/drv_hangman/hangcheck-unterminated
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1456930109-21532-1-git-send-email-mika.kuoppala@intel.com
2016-03-04 15:17:14 +02:00
Ville Syrjälä
6831f3e3c6 drm/i915: Add for_each_pipe_masked()
for_each_pipe_masked() can be used to iterate over the pipes
included in the user provided pipe mask. Removes a few lines of
duplicated code.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1455907651-16397-2-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
2016-02-22 19:28:16 +02:00
Ville Syrjälä
aae8ba8444 drm/i915: Make sure pipe interrupts are processed before turning off power well on BDW+
Starting from BDW the DE_PIPE interrupts for pipe B and C belong to the
relevant display power well. So we should make sure we've finished
processing them before turning off the power well.

The pipe interrupts shouldn't really happen at this point anymore since
we've already shut down the planes/pipes/whatnot, but being a bit
paranoid shouldn't hurt.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1455907651-16397-1-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
2016-02-22 19:28:13 +02:00
Ville Syrjälä
1ca993d237 drm/i915: Skip PIPESTAT reads from irq handler on VLV/CHV when power well is down
PIPESTAT registers live in the display power well on VLV/CHV, so we
shouldn't access them when things are powered down. Let's check
whether the display interrupts are on or off before accessing the
PIPESTAT registers.

Another option would be to read the PIPESTAT registers only when
the IIR register indicates that there's a pending pipe event. But
that would mean we might miss even more underrun reports than we
do now, because the underrun status bit lives in PIPESTAT but doesn't
actually generate an interrupt.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93738
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1455825266-24686-1-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
2016-02-22 19:28:04 +02:00
Tvrtko Ursulin
f11a0f46a2 drm/i915/gen8: Factor out display interrupt handling
Tidy quite long interrupt service routine by factoring out
the display part.

This simplifies the exit path a little bit, makes the code
a bit more readable, and potentialy makes code reuse in the
future easier.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1452614647-13973-2-git-send-email-tvrtko.ursulin@linux.intel.com
2016-01-13 10:01:53 +00:00
Tvrtko Ursulin
e32192e1ae drm/i915/gen8: Tidy display interrupt processing
One bugfix and a few tidy-ups:

 * Pipe fault logging was broken on Gen9+.
 * Removed some unnecessary local variables.
 * Removed unnecessary initializers.
 * Decreased pipe iir block indentation level.
 * Grouped variable initialization close to use sites.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@cris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1452614647-13973-1-git-send-email-tvrtko.ursulin@linux.intel.com
2016-01-13 10:01:35 +00:00
Mika Kuoppala
7571494004 drm/i915: Do one shot unclaimed mmio detection less frequently
We have done unclaimed register access check in normal
(mmio_debug=0) mode once per write. This adds probability
of finding the exact sequence where we did the bad access, but
also adds burden to each write.

As we have mmio_debug available for more fine grained analysis,
give up accuracy of detecting correct spot at the first occurrence
by doing the one shot detection and arming of mmio_debug in hangcheck
and in modeset. This removes the write path performance burden.

v2: Remove gratuitous DRM_DEBUG and return value, comments (Chris)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Paulo Zanoni <przanoni@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1450250808-14864-1-git-send-email-mika.kuoppala@intel.com
2016-01-08 13:13:50 +02:00
Mika Kuoppala
fc97618bf3 drm/i915: Introduce intel_uncore_unclaimed_mmio
Currently interrupt code is the only place checking
for the unclaimed register access prior to actual register
macros using the same functionality. Rename the function
and make it return bool so that the possible error message
context is clear in the caller side. The motivation is to allow
usage of unclaimed detection on arbitrary places.

v2: rebase, s/access/mmio, s/dev/dev_priv

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1450189512-30360-2-git-send-email-mika.kuoppala@intel.com
2016-01-08 13:09:18 +02:00
Mika Kuoppala
61642ff035 drm/i915: Inspect subunit states on hangcheck
If head seems stuck and engine in question is rcs,
inspect subunit state transitions from undone to done,
before deciding that this really is a hang instead of limited
progress. Only account the transitions of subunits from
undone to done once, to prevent unstable subunit states
to keep us falsely active.

As this adds one extra steps to hangcheck heuristics,
before hang is declared, it adds 1500ms to to detect hang
for render ring to a total of 7500ms. We could sample
the subunit states on first head stuck condition but
decide not to do so only in order to mimic old behaviour. This
way the check order of promotion from seqno > atchd > instdone
is consistently done.

v2: Deal with unstable done states (Arun)
    Clear instdone progress on head and seqno movement (Chris)
    Report raw and accumulated instdone's in in debugfs (Chris)
    Return HANGCHECK_ACTIVE on undone->done

References: https://bugs.freedesktop.org/show_bug.cgi?id=93029
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Gordon <david.s.gordon@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Arun Siluvery <arun.siluvery@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1448985372-19535-1-git-send-email-mika.kuoppala@intel.com
2016-01-08 13:06:04 +02:00
Jani Nikula
2dfb0b816d drm/i915: shut up gen8+ SDE irq dmesg noise, again
We still keep getting

[    4.249930] [drm:gen8_irq_handler [i915]] *ERROR* The master control interrupt lied (SDE)!

This reverts

commit 820da7ae46
Author: Jani Nikula <jani.nikula@intel.com>
Date:   Wed Nov 25 16:47:23 2015 +0200

    Revert "drm/i915: shut up gen8+ SDE irq dmesg noise"

which in itself is a revert, so this is just doing

commit 97e5ed1111
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Fri Oct 23 10:56:12 2015 +0200

    drm/i915: shut up gen8+ SDE irq dmesg noise

all over again. I'll stop pretending I understand what's going on like I
did when I thought I'd fixed this for good in

commit 6a39d7c986
Author: Jani Nikula <jani.nikula@intel.com>
Date:   Wed Nov 25 16:47:22 2015 +0200

    drm/i915: fix the SDE irq dmesg warnings properly

Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Reference: http://mid.gmane.org/20151213124945.GA5715@nuc-i3427.alporthouse.com
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92084
Cc: drm-intel-fixes@lists.freedesktop.org
Fixes: 820da7ae46 ("Revert "drm/i915: shut up gen8+ SDE irq dmesg noise"")
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1452155350-14658-1-git-send-email-jani.nikula@intel.com
2016-01-07 11:47:43 +02:00
Imre Deak
1f814daca4 drm/i915: add support for checking if we hold an RPM reference
Atm, we assert that the device is not suspended until the point when the
device is truly put to a suspended state. This is fine, but we can catch
more problems if we check that RPM refcount is non-zero. After that one
drops to zero we shouldn't access the device any more, even if the actual
device suspend may be delayed. Change assert_rpm_wakelock_held()
accordingly to check for a non-zero RPM refcount in addition to the
current device-not-suspended check.

For the new asserts to work we need to annotate every place explicitly in
the code where we expect that the device is powered. The places where we
only assume this, but may not hold an RPM reference:
- driver load
  We assume the device to be powered until we enable RPM. Make this
  explicit by taking an RPM reference around the load function.
- system and runtime sudpend/resume handlers
  These handlers are called when the RPM reference becomes 0 and know the
  exact point after which the device can get powered off. Disable the
  RPM-reference-held check for their duration.
- the IRQ, hangcheck and RPS work handlers
  These handlers are flushed in the system/runtime suspend handler
  before the device is powered off, so it's guaranteed that they won't
  run while the device is powered off even though they don't hold any
  RPM reference. Disable the RPM-reference-held check for their duration.

In all these cases we still check that the device is not suspended.
These explicit annotations also have the positive side effect of
documenting our assumptions better.

This caught additional WARNs from the atomic modeset path, those should
be fixed separately.

v2:
- remove the redundant HAS_RUNTIME_PM check (moved to patch 1) (Ville)
v3:
- use a new dedicated RPM wakelock refcount to also catch cases where
  our own RPM get/put functions were not called (Chris)
- assert also that the new RPM wakelock refcount is 0 in the RPM
  suspend handler (Chris)
- change the assert error message to be more meaningful (Chris)
- prevent false assert errors and check that the RPM wakelock is 0 in
  the RPM resume handler too
- prevent false assert errors in the hangcheck work too
- add a device not suspended assert check to the hangcheck work
v4:
- rename disable/enable_rpm_asserts to disable/enable_rpm_wakeref_asserts
  and wakelock_count to wakeref_count
- disable the wakeref asserts in the IRQ handlers and RPS work too
- update/clarify commit message
v5:
- mark places we plan to change to use proper RPM refcounting with
  separate DISABLE/ENABLE_RPM_WAKEREF_ASSERTS aliases (Chris)

Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1450227139-13471-1-git-send-email-imre.deak@intel.com
2015-12-17 15:59:44 +02:00
Wayne Boyer
666a45379e drm/i915: Separate cherryview from valleyview
The cherryview device shares many characteristics with the valleyview
device.  When support was added to the driver for cherryview, the
corresponding device info structure included .is_valleyview = 1.
This is not correct and leads to some confusion.

This patch changes .is_valleyview to .is_cherryview in the cherryview
device info structure and simplifies the IS_CHERRYVIEW macro.
Then where appropriate, instances of IS_VALLEYVIEW are replaced with
IS_VALLEYVIEW || IS_CHERRYVIEW or equivalent.

v2: Use IS_VALLEYVIEW || IS_CHERRYVIEW instead of defining a new macro.
    Also add followup patches to fix issues discovered during the first
    review. (Ville)
v3: Fix some style issues and one gen check. Remove CRT related changes
    as CRT is not supported on CHV. (Imre, Ville)
v4: Make a few more optimizations. (Ville)

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Wayne Boyer <wayne.boyer@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1449692975-14803-1-git-send-email-wayne.boyer@intel.com
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
2015-12-10 11:07:24 +01:00
Ville Syrjälä
81fd874e55 drm/i915: Fix kerneldoc indent fails
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1448461290-12333-1-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-11-26 18:55:40 +02:00