Commit Graph

1084 Commits

Author SHA1 Message Date
Chris Wilson
1300b4f804 drm/i915: Inline gen6_sanitize_rps_pm_mask()
gen6_sanitize_rps_pm_mask() is small enough that inlining it shrinks the
object code.

v2: Use const markup

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170312135426.2216-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2017-03-13 14:24:58 +00:00
Chris Wilson
655d49ef77 drm/i915: Rename REDIRECT_TO_GUC bit
The REDIRECT_TO_GUC bit is a strange beast as it is a disable bit -
setting the bit in the pm interrupt generation stops the interrupt going
to the guc (not sending it to the guc as the name implies). To help the
reader rename it to DISABLE_REDIRECT_TO_GUC so that we keep the bspec
greppable name without it being as confusing!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Radoslaw Szwichtenberg <radoslaw.szwichtenberg@intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170312132745.9618-1-chris@chris-wilson.co.uk
Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
2017-03-13 11:16:09 +00:00
Sagar Arun Kamble
1f3b1fd386 drm/i915/guc: Update rps.pm_intrmsk_mbz in guc_interrupts_capture/release
Different state is to be maintained for rps.pm_intrmsk_mbz for GuC and
Execlists. Updating it inside guc_interrupts_* routines as in those
routines GuC load/submission params are sanitized and it should not be set
based on HAS_GUC_SCHED during intel_irq_init.

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Radoslaw Szwichtenberg <radoslaw.szwichtenberg@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1489199821-6707-3-git-send-email-sagar.a.kamble@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-03-12 12:59:11 +00:00
Sagar Arun Kamble
5dd0455667 drm/i915: s/pm_intr_keep/pm_intrmsk_mbz
"pm_intr_keep" is not conveying the intent that it is bitmask
of interrupts that must be zero(mbz) in GEN6_PMINTRMSK.
Name it "pm_intrmsk_mbz".

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Radoslaw Szwichtenberg <radoslaw.szwichtenberg@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1489199821-6707-2-git-send-email-sagar.a.kamble@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-03-12 12:59:08 +00:00
Chris Wilson
7c0a16ad7c drm/i915: Defer unmasking RPS interrupts until after making adjustments
To make our adjustments to RPS requires taking a mutex and potentially
sleeping for an unknown duration - until we have completed our
adjustments further RPS interrupts are immaterial (they are based on
stale thresholds) and we can safely ignore them.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170309211232.28878-3-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2017-03-10 14:28:44 +00:00
Chris Wilson
569884e3d4 drm/i915: Use max(render, media) for Baytrail busyness calculation
Currently, we sum the render and media cycles (on different engines) to
compute a percentage - but we fail to factor in the duplication into the
threshold calculations. This makes us very eager to upclock!

If we just consider the maximum busy cycles of either counter, we should
have an accurate reflection on whether there are cycles to spare to
handle the workload at this frequency.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170309211232.28878-2-chris@chris-wilson.co.uk
2017-03-10 14:28:44 +00:00
Chris Wilson
e0e8c7cb6e drm/i915: Stop using RP_DOWN_EI on Baytrail
On Baytrail, we manually calculate busyness over the evaluation interval
to avoid issues with miscaluations with RC6 enabled. However, it turns
out that the DOWN_EI interrupt generator is completely bust - it
operates in two modes, continuous or never. Neither of which are
conducive to good behaviour. Stop unmask the DOWN_EI interrupt and just
compute everything from the UP_EI which does seem to correspond to the
desired interval.

v2: Fixup gen6_rps_pm_mask() as well
v3: Inline vlv_c0_above() to combine the now identical elapsed
calculation for up/down and simplify the threshold testing

Fixes: 43cf3bf084 ("drm/i915: Improved w/a for rps on Baytrail")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v4.1+
Link: http://patchwork.freedesktop.org/patch/msgid/20170309211232.28878-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2017-03-10 14:28:39 +00:00
Sagar Arun Kamble
9735b04d0c drm/i915: Initialize pm_intr_keep during intel_irq_init for GuC
Driver needs to ensure that it doesn't mask the PM interrupts, which are
unmasked/needed by GuC firmware. For that, Driver maintains a bitmask of
interrupts to be kept unmasked, pm_intr_keep.

pm_intr_keep was determined across GuC load. GuC gets loaded in different
scenarios and it is not going to change the pm_intr_keep so this patch
moves its setup to intel_irq_init.

This patch fixes incorrect RPS masking leading to UP interrupts triggered
even when at cur_freq=max and inversly for Down interrupts.

Cc: Radoslaw Szwichtenberg <radoslaw.szwichtenberg@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1488862355-9768-1-git-send-email-sagar.a.kamble@intel.com
2017-03-09 12:32:22 +00:00
Dave Airlie
2e16101780 Merge tag 'drm-intel-next-2017-03-06' of git://anongit.freedesktop.org/git/drm-intel into drm-next
4 weeks worth of stuff since I was traveling&lazy:

- lspcon improvements (Imre)
- proper atomic state for cdclk handling (Ville)
- gpu reset improvements (Chris)
- lots and lots of polish around fences, requests, waiting and
  everything related all over (both gem and modeset code), from Chris
- atomic by default on gen5+ minus byt/bsw (Maarten did the patch to
  flip the default, really this is a massive joint team effort)
- moar power domains, now 64bit (Ander)
- big pile of in-kernel unit tests for various gem subsystems (Chris),
  including simple mock objects for i915 device and and the ggtt
  manager.
- i915_gpu_info in debugfs, for taking a snapshot of the current gpu
  state. Same thing as i915_error_state, but useful if the kernel didn't
  notice something is stick. From Chris.
- bxt dsi fixes (Umar Shankar)
- bxt w/a updates (Jani)
- no more struct_mutex for gem object unreference (Chris)
- some execlist refactoring (Tvrtko)
- color manager support for glk (Ander)
- improve the power-well sync code to better take over from the
  firmware (Imre)
- gem tracepoint polish (Tvrtko)
- lots of glk fixes all around (Ander)
- ctx switch improvements (Chris)
- glk dsi support&fixes (Deepak M)
- dsi fixes for vlv and clanups, lots of them (Hans de Goede)
- switch to i915.ko types in lots of our internal modeset code (Ander)
- byt/bsw atomic wm update code, yay (Ville)

* tag 'drm-intel-next-2017-03-06' of git://anongit.freedesktop.org/git/drm-intel: (432 commits)
  drm/i915: Update DRIVER_DATE to 20170306
  drm/i915: Don't use enums for hardware engine id
  drm/i915: Split breadcrumbs spinlock into two
  drm/i915: Refactor wakeup of the next breadcrumb waiter
  drm/i915: Take reference for signaling the request from hardirq
  drm/i915: Add FIFO underrun tracepoints
  drm/i915: Add cxsr toggle tracepoint
  drm/i915: Add VLV/CHV watermark/FIFO programming tracepoints
  drm/i915: Add plane update/disable tracepoints
  drm/i915: Kill level 0 wm hack for VLV/CHV
  drm/i915: Workaround VLV/CHV sprite1->sprite0 enable underrun
  drm/i915: Sanitize VLV/CHV watermarks properly
  drm/i915: Only use update_wm_{pre,post} for pre-ilk platforms
  drm/i915: Nuke crtc->wm.cxsr_allowed
  drm/i915: Compute proper intermediate wms for vlv/cvh
  drm/i915: Skip useless watermark/FIFO related work on VLV/CHV when not needed
  drm/i915: Compute vlv/chv wms the atomic way
  drm/i915: Compute VLV/CHV FIFO sizes based on the PM2 watermarks
  drm/i915: Plop vlv/chv fifo sizes into crtc state
  drm/i915: Plop vlv wm state into crtc_state
  ...
2017-03-08 12:41:47 +10:00
Dave Airlie
b558dfd56a Merge tag 'drm-misc-next-2017-03-06' of git://anongit.freedesktop.org/git/drm-misc into drm-next
First slice of drm-misc-next for 4.12:

Core/subsystem-wide:
- link status core patch from Manasi, for signalling link train fail
  to userspace. I also had the i915 patch in here, but that had a
  small buglet in our CI, so reverted.
- more debugfs_remove removal from Noralf, almost there now (Noralf
  said he'll try to follow up with the stragglers).
- drm todo moved into kerneldoc, for better visibility (see
  Documentation/gpu/todo.rst), lots of starter tasks in there.
- devm_ of helpers + use it in sti (from Ben Gaignard, acked by Rob
  Herring)
- extended framebuffer fbdev support (for fbdev flipping), and vblank
  wait ioctl fbdev support (Maxime Ripard)
- misc small things all over, as usual
- add vblank callbacks to drm_crtc_funcs, plus make lots of good use
  of this to simplify drivers (Shawn Guo)
- new atomic iterator macros to unconfuse old vs. new state

Small drivers:
- vc4 improvements from Eric
- vc4 kerneldocs (Eric)!
- tons of improvements for dw-mipi-dsi in rockchip from John Keeping
  and Chris Zhong.
- MAINTAINERS entries for drivers managed in drm-misc. It's not yet
  official, still an experiment, but definitely not complete fail and
  better to avoid confusion. We kinda screwed that up with drm-misc a
  bit when we started committers last year.
- qxl atomic conversion (Gabriel Krisman)
- bunch of virtual driver polish (qxl, virgl, ...)
- misc tiny patches all over

This is the first time we've done the same merge-window blackout for
drm-misc as we've done for drm-intel for ages, hence why we have a
_lot_ of stuff queued already. But it's still only half of drm-intel
(room to grow!), and the drivers in drm-misc experiment seems to work
at least insofar as that you also get lots of driver updates here
alredy.

* tag 'drm-misc-next-2017-03-06' of git://anongit.freedesktop.org/git/drm-misc: (141 commits)
  drm/vc4: Fix OOPSes from trying to cache a partially constructed BO.
  drm/vc4: Fulfill user BO creation requests from the kernel BO cache.
  Revert "drm/i915: Implement Link Rate fallback on Link training failure"
  drm/fb-helper: implement ioctl FBIO_WAITFORVSYNC
  drm: Update drm_fbdev_cma_init documentation
  drm/rockchip/dsi: add dw-mipi power domain support
  drm/rockchip/dsi: fix insufficient bandwidth of some panel
  dt-bindings: add power domain node for dw-mipi-rockchip
  drm/rockchip/dsi: remove mode_valid function
  drm/rockchip/dsi: dw-mipi: correct the coding style
  drm/rockchip/dsi: dw-mipi: support RK3399 mipi dsi
  dt-bindings: add rk3399 support for dw-mipi-rockchip
  drm/rockchip: dw-mipi-dsi: add reset control
  drm/rockchip: dw-mipi-dsi: support non-burst modes
  drm/rockchip: dw-mipi-dsi: defer probe if panel is not loaded
  drm/rockchip: vop: test for P{H,V}SYNC
  drm/rockchip: dw-mipi-dsi: use positive check for N{H, V}SYNC
  drm/rockchip: dw-mipi-dsi: use specific poll helper
  drm/rockchip: dw-mipi-dsi: improve PLL configuration
  drm/rockchip: dw-mipi-dsi: properly configure PHY timing
  ...
2017-03-07 13:59:53 +10:00
Chris Wilson
61d3dc7080 drm/i915: Split breadcrumbs spinlock into two
As we now take the breadcrumbs spinlock within the interrupt handler, we
wish to minimise its hold time. During the interrupt we do not care
about the state of the full rbtree, only that of the first element, so
we can guard that with a separate lock.

v2: Rename first_wait to irq_wait to make it clearer that it is guarded
by irq_lock.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170303190824.1330-1-chris@chris-wilson.co.uk
2017-03-03 20:19:13 +00:00
Chris Wilson
24754d751c drm/i915: Take reference for signaling the request from hardirq
Being inside a spinlock signaling that the hardware just completed a
request doesn't prevent a second thread already spotting that the
request is complete, freeing it and reallocating it! The code currently
tries to prevent this using RCU -- but that only prevents the request
from being freed, it doesn't prevent us from reallocating it - that
requires us to take a reference.

[  206.922985] BUG: spinlock already unlocked on CPU#4, gem_exec_parall/7796
[  206.922994]  lock: 0xffff8801c6047120, .magic: dead4ead, .owner: <none>/-1, .owner_cpu: -1
[  206.923000] CPU: 4 PID: 7796 Comm: gem_exec_parall Not tainted 4.10.0-CI-Patchwork_4008+ #1
[  206.923006] Hardware name: System manufacturer System Product Name/Z170M-PLUS, BIOS 1805 06/20/2016
[  206.923012] Call Trace:
[  206.923014]  <IRQ>
[  206.923019]  dump_stack+0x67/0x92
[  206.923023]  spin_dump+0x73/0xc0
[  206.923027]  do_raw_spin_unlock+0x79/0xb0
[  206.923031]  _raw_spin_unlock_irqrestore+0x27/0x60
[  206.923042]  dma_fence_signal+0x160/0x230
[  206.923060]  notify_ring+0xae/0x2e0 [i915]
[  206.923073]  ? ibx_hpd_irq_handler+0xc0/0xc0 [i915]
[  206.923086]  gen8_gt_irq_handler+0x219/0x290 [i915]
[  206.923100]  gen8_irq_handler+0x8e/0x6b0 [i915]
[  206.923105]  __handle_irq_event_percpu+0x58/0x370
[  206.923109]  handle_irq_event_percpu+0x1e/0x50
[  206.923113]  handle_irq_event+0x34/0x60
[  206.923117]  handle_edge_irq+0xbe/0x150
[  206.923122]  handle_irq+0x15/0x20
[  206.923126]  do_IRQ+0x63/0x130
[  206.923142]  ? i915_mutex_lock_interruptible+0x39/0x140 [i915]
[  206.923148]  common_interrupt+0x90/0x90
[  206.923153] RIP: 0010:osq_lock+0x77/0x110
[  206.923157] RSP: 0018:ffffc90001cabaa0 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff6e
[  206.923164] RAX: 0000000000000000 RBX: ffff880236d1abc0 RCX: ffff8801ef642fc0
[  206.923169] RDX: ffff8801ef6427c0 RSI: ffffffff81c6e7fd RDI: ffffffff81c7c848
[  206.923175] RBP: ffffc90001cabab8 R08: 00000000692bb19b R09: 08c1493200000000
[  206.923180] R10: 0000000000000001 R11: 0000000000000001 R12: ffff880236cdabc0
[  206.923185] R13: ffff8802207f00b0 R14: ffffffffa00b7cd9 R15: ffff8802207f0070
[  206.923191]  </IRQ>
[  206.923206]  ? i915_mutex_lock_interruptible+0x39/0x140 [i915]
[  206.923213]  __mutex_lock+0x649/0x990
[  206.923217]  ? __mutex_lock+0xb0/0x990
[  206.923221]  ? _raw_spin_unlock+0x2c/0x50
[  206.923226]  ? __pm_runtime_resume+0x56/0x80
[  206.923242]  ? i915_mutex_lock_interruptible+0x39/0x140 [i915]
[  206.923249]  mutex_lock_interruptible_nested+0x16/0x20
[  206.923264]  i915_mutex_lock_interruptible+0x39/0x140 [i915]
[  206.923270]  ? __pm_runtime_resume+0x56/0x80
[  206.923285]  i915_gem_do_execbuffer.isra.15+0x442/0x1d10 [i915]
[  206.923291]  ? __lock_acquire+0x449/0x1b50
[  206.923296]  ? __might_fault+0x3e/0x90
[  206.923301]  ? __might_fault+0x87/0x90
[  206.923305]  ? __might_fault+0x3e/0x90
[  206.923320]  i915_gem_execbuffer2+0xb5/0x220 [i915]
[  206.923327]  drm_ioctl+0x200/0x450
[  206.923341]  ? i915_gem_execbuffer+0x330/0x330 [i915]
[  206.923348]  do_vfs_ioctl+0x90/0x6e0
[  206.923352]  ? __fget+0x108/0x200
[  206.923356]  ? expand_files+0x2b0/0x2b0
[  206.923361]  SyS_ioctl+0x3c/0x70
[  206.923365]  entry_SYSCALL_64_fastpath+0x1c/0xb1
[  206.923369] RIP: 0033:0x7fdd75fc6357
[  206.923373] RSP: 002b:00007fdd20e59bf8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[  206.923380] RAX: ffffffffffffffda RBX: ffffffff81481ff3 RCX: 00007fdd75fc6357
[  206.923385] RDX: 00007fdd20e59c70 RSI: 0000000040406469 RDI: 0000000000000003
[  206.923390] RBP: ffffc90001cabf88 R08: 0000000000000040 R09: 00000000000003f7
[  206.923396] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[  206.923401] R13: 0000000000000003 R14: 0000000040406469 R15: 0000000001cf9cb0
[  206.923408]  ? __this_cpu_preempt_check+0x13/0x20

Fixes: 56299fb7d9 ("drm/i915: Signal first fence from irq handler if complete")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100051
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170303144557.4815-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
2017-03-03 15:53:09 +00:00
Ville Syrjälä
722595362c drm/i915: Add plane update/disable tracepoints
Add tracepoints for plane programming. The tracepoints will dump
the frame and scanline counters, so this can be used to verify eg. that
the plane gets reprogrammed at the right time with respect to watermark
programming (if we have appropriate tracepoints for that as well).

v2: Rebase due to legacy cursor changes

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170302171508.1666-16-ville.syrjala@linux.intel.com
2017-03-03 16:50:10 +02:00
Chris Wilson
675204153e drm/i915: s/assert_spin_locked/lockdep_assert_held/
assert_spin_locked() becomes an unconditionally compiled BUG_ON(),
adding debug code right into the heart of critical routines like
interrupt handlers.

   text	   data	    bss	    dec	    hex
1296480	  19944	   2272	1318696	 141f28	before (lockdep disabled)
1295984	  19944	   2272	1318200	 141d38	after

1336261	  21139	   3208	1360608	 14c2e0	before (lockdep enabled)
1339920	  21139	   3208	1364267	 14d12b	after

Small saving for release; hopefully more instructive in debug.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170302132801.599-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-03-02 15:18:55 +00:00
Chris Wilson
67b807a892 drm/i915: Delay disabling the user interrupt for breadcrumbs
A significant cost in setting up a wait is the overhead of enabling the
interrupt. As we disable the interrupt whenever the queue of waiters is
empty, if we are frequently waiting on alternating batches, we end up
re-enabling the interrupt on a frequent basis. We do want to disable the
interrupt during normal operations as under high load it may add several
thousand interrupts/s - we have been known in the past to occupy whole
cores with our interrupt handler after accidentally leaving user
interrupts enabled. As a compromise, leave the interrupt enabled until
the next IRQ, or the system is idle. This gives a small window for a
waiter to keep the interrupt active and not be delayed by having to
re-enable the interrupt.

v2: Restore hangcheck/missed-irq detection for continuations
v3: Be more careful restoring the hangcheck timer after reset
v4: Be more careful restoring the fake irq after reset (if required!)
v5: Redo changes to intel_engine_wakeup()
v6: Factor out __intel_engine_wakeup()
v7: Improve commentary for declaring a missed wakeup

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170227205850.2828-4-chris@chris-wilson.co.uk
2017-02-27 21:57:23 +00:00
Chris Wilson
56299fb7d9 drm/i915: Signal first fence from irq handler if complete
As execlists and other non-semaphore multi-engine devices coordinate
between engines using interrupts, we can shave off a few 10s of
microsecond of scheduling latency by doing the fence signaling from the
interrupt as opposed to a RT kthread. (Realistically the delay adds
about 1% to an individual cross-engine workload.) We only signal the
first fence in order to limit the amount of work we move into the
interrupt handler. We also have to remember that our breadcrumbs may be
unordered with respect to the interrupt and so we still require the
waiter process to perform some heavyweight coherency fixups, as well as
traversing the tree of waiters.

v2: No need for early exit in irq handler - it breaks the flow between
patches and prevents the tracepoint
v3: Restore rcu hold across irq signaling of request

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170227205850.2828-2-chris@chris-wilson.co.uk
2017-02-27 21:57:20 +00:00
Daniel Vetter
8e22e1b349 Merge airlied/drm-next into drm-misc-next
Backmerge the main pull request to sync up with all the newly landed
drivers. Otherwise we'll have chaos even before 4.12 started in
earnest.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2017-02-26 21:34:42 +01:00
Linus Torvalds
ef96152e6a Less anger inducing pull request for 4.11
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJYr5aeAAoJEAx081l5xIa+ZK4P/RD3XUsduYqziVFCRQ2n0X8r
 +D92F4peTnSeSq7ZcZvprv+fezUGAHbfsWFs8feYCI5quUO6pEQSPwN+wyGazUi0
 4hUVB/K9Iq7U/Bj7Z/SmsU3NuWJnkNqbmvSFvUdqYK9D/kl+Tnllzap2N4cTzjwu
 GZOObz4n85cx94NqC3qw+7/ptL1X2MhXa+z0MzbkKyas84Bko1LwCSHRHsDKUnJc
 IcSpOcYZ6pSRMIsKH4Kd79Go4vWm7djXT9XL3PwDk2NcXXUOuR+cfdHqYchYaM/O
 iD2hvaSywBcflxSAml5x6vlXraoRd91ZZulgOObXtFfnUXdZB81TVq4uv6LU4Bx3
 jLFixUZuk/TJT+W/8N10l7M6yMIFaTpNoNMc5n4IF5RNNyWba4BKnrI+f+lQiOpY
 mmjIaidb0t5BICnJzCD264RhCEXmP0HaDV+iQQV6y6jJRXfd1bgnOXLKP73JekzB
 TsbDshCoE7UO0dJ7n0LFpXSTQDTYzlazoEp14f2kFBxir5/l7r67nUlnDTvUQfuN
 tSRvpN/s0wqvH3o7zhmpHxyJ/ZasPMQjNCFAuUEbx8L5SKXsua0FubIzN4aVpilb
 XvfdFRWM/lkOT/q+8cGI/TcE3YTqEmALmGxdV/akbdNCiCg6aClyCLRE/DZhgmSQ
 UMFjr9wlHl5Qo/OqLKj0
 =Yjfg
 -----END PGP SIGNATURE-----

Merge tag 'drm-for-v4.11-less-shouty' of git://people.freedesktop.org/~airlied/linux

Pull drm updates from Dave Airlie:
 "This is the main drm pull request for v4.11.

  Nothing too major, the tinydrm and mmu-less support should make
  writing smaller drivers easier for some of the simpler platforms, and
  there are a bunch of documentation updates.

  Intel grew displayport MST audio support which is hopefully useful to
  people, and FBC is on by default for GEN9+ (so people know where to
  look for regressions). AMDGPU has a lot of fixes that would like new
  firmware files installed for some GPUs.

  Other than that it's pretty scattered all over.

  I may have a follow up pull request as I know BenH has a bunch of AST
  rework and fixes and I'd like to get those in once they've been tested
  by AST, and I've got at least one pull request I'm just trying to get
  the author to fix up.

  Core:
   - drm_mm reworked
   - Connector list locking and iterators
   - Documentation updates
   - Format handling rework
   - MMU-less support for fbdev helpers
   - drm_crtc_from_index helper
   - Core CRC API
   - Remove drm_framebuffer_unregister_private
   - Debugfs cleanup
   - EDID/Infoframe fixes
   - Release callback
   - Tinydrm support (smaller drivers for simple hw)

  panel:
   - Add support for some new simple panels

  i915:
   - FBC by default for gen9+
   - Shared dpll cleanups and docs
   - GEN8 powerdomain cleanup
   - DMC support on GLK
   - DP MST audio support
   - HuC loading support
   - GVT init ordering fixes
   - GVT IOMMU workaround fix

  amdgpu/radeon:
   - Power/clockgating improvements
   - Preliminary SR-IOV support
   - TTM buffer priority and eviction fixes
   - SI DPM quirks removed due to firmware fixes
   - Powerplay improvements
   - VCE/UVD powergating fixes
   - Cleanup SI GFX code to match CI/VI
   - Support for > 2 displays on 3/5 crtc asics
   - SI headless fixes

  nouveau:
   - Rework securre boot code in prep for GP10x secure boot
   - Channel recovery improvements
   - Initial power budget code
   - MMU rework preperation

  vmwgfx:
   - Bunch of fixes and cleanups

  exynos:
   - Runtime PM support for MIC driver
   - Cleanups to use atomic helpers
   - UHD Support for TM2/TM2E boards
   - Trigger mode fix for Rinato board

  etnaviv:
   - Shader performance fix
   - Command stream validator fixes
   - Command buffer suballocator

  rockchip:
   - CDN DisplayPort support
   - IOMMU support for arm64 platform

  imx-drm:
   - Fix i.MX5 TV encoder probing
   - Remove lower fb size limits

  msm:
   - Support for HW cursor on MDP5 devices
   - DSI encoder cleanup
   - GPU DT bindings cleanup

  sti:
   - stih410 cleanups
   - Create fbdev at binding
   - HQVDP fixes
   - Remove stih416 chip functionality
   - DVI/HDMI mode selection fixes
   - FPS statistic reporting

  omapdrm:
   - IRQ code cleanup

  dwi-hdmi bridge:
   - Cleanups and fixes

  adv-bridge:
   - Updates for nexus

  sii8520 bridge:
   - Add interlace mode support
   - Rework HDMI and lots of fixes

  qxl:
   - probing/teardown cleanups

  ZTE drm:
   - HDMI audio via SPDIF interface
   - Video Layer overlay plane support
   - Add TV encoder output device

  atmel-hlcdc:
   - Rework fbdev creation logic

  tegra:
   - OF node fix

  fsl-dcu:
   - Minor fixes

  mali-dp:
   - Assorted fixes

  sunxi:
   - Minor fix"

[ This was the "fixed" pull, that still had build warnings due to people
  not even having build tested the result. I'm not a happy camper

  I've fixed the things I noticed up in this merge.      - Linus ]

* tag 'drm-for-v4.11-less-shouty' of git://people.freedesktop.org/~airlied/linux: (1177 commits)
  lib/Kconfig: make PRIME_NUMBERS not user selectable
  drm/tinydrm: helpers: Properly fix backlight dependency
  drm/tinydrm: mipi-dbi: Fix field width specifier warning
  drm/tinydrm: mipi-dbi: Silence: ‘cmd’ may be used uninitialized
  drm/sti: fix build warnings in sti_drv.c and sti_vtg.c files
  drm/amd/powerplay: fix PSI feature on Polars12
  drm/amdgpu: refuse to reserve io mem for split VRAM buffers
  drm/ttm: fix use-after-free races in vm fault handling
  drm/tinydrm: Add support for Multi-Inno MI0283QT display
  dt-bindings: Add Multi-Inno MI0283QT binding
  dt-bindings: display/panel: Add common rotation property
  of: Add vendor prefix for Multi-Inno
  drm/tinydrm: Add MIPI DBI support
  drm/tinydrm: Add helper functions
  drm: Add DRM support for tiny LCD displays
  drm/amd/amdgpu: post card if there is real hw resetting performed
  drm/nouveau/tmr: provide backtrace when a timeout is hit
  drm/nouveau/pci/g92: Fix rearm
  drm/nouveau/drm/therm/fan: add a fallback if no fan control is specified in the vbios
  drm/nouveau/hwmon: expose power_max and power_crit
  ..
2017-02-23 18:58:18 -08:00
Tvrtko Ursulin
dffabc8f4e drm/i915/tracepoints: Rename i915_gem_request_notify
i915_gem_ring_notify is more appropriate since we do not have
the request information at this point, but it is simply a
signal from the engine that some request has been completed.

v2:
  * Always trace and log if there were any waiters.
  * Rename to intel_engine_notify. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2017-02-21 13:18:19 +00:00
Chris Wilson
2246bea6cf drm/i915: Postpone fake breadcrumb interrupt until real interrupts cease
When the timer expires for checking on interrupt processing, check to
see if any interrupts arrived within the last time period. If real
interrupts are still being delivered, we can be reassured that we
haven't missed the final interrupt as the waiter will still be woken.
Only once all activity ceases, do we have to worry about the waiter
never being woken and so need to install a timer to kick the waiter for
a slow arrival of a seqno.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170217151304.16665-2-chris@chris-wilson.co.uk
2017-02-17 15:30:50 +00:00
Imre Deak
2a57d9cce1 drm/i915/gen9+: Enable hotplug detection early
For LSPCON resume time initialization we need to sample the
corresponding pin's HPD level, but this is only available when HPD
detection is enabled. Currently we enable detection only when enabling
HPD interrupts which is too late, so bring the enabling of detection
earlier.

This is needed by the next patch.

Cc: Shashank Sharma <shashank.sharma@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: <stable@vger.kernel.org> # v4.9+
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Shashank Sharma <shashank.sharma@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1485509961-9010-2-git-send-email-imre.deak@intel.com
(cherry picked from commit 7fff8126d9)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2017-02-16 11:59:10 +02:00
Chris Wilson
262fd485ac drm/i915: Only enable hotplug interrupts if the display interrupts are enabled
In order to prevent accessing the hpd registers outside of the display
power wells, we should refrain from writing to the registers before the
display interrupts are enabled.

[    4.740136] WARNING: CPU: 1 PID: 221 at drivers/gpu/drm/i915/intel_uncore.c:795 __unclaimed_reg_debug+0x44/0x50 [i915]
[    4.740155] Unclaimed read from register 0x1e1110
[    4.740168] Modules linked in: i915(+) intel_gtt drm_kms_helper prime_numbers
[    4.740190] CPU: 1 PID: 221 Comm: systemd-udevd Not tainted 4.10.0-rc6+ #384
[    4.740203] Hardware name:                  /        , BIOS PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015
[    4.740220] Call Trace:
[    4.740236]  dump_stack+0x4d/0x6f
[    4.740251]  __warn+0xc1/0xe0
[    4.740265]  warn_slowpath_fmt+0x4a/0x50
[    4.740281]  ? insert_work+0x77/0xc0
[    4.740355]  ? fwtable_write32+0x90/0x130 [i915]
[    4.740431]  __unclaimed_reg_debug+0x44/0x50 [i915]
[    4.740507]  fwtable_read32+0xd8/0x130 [i915]
[    4.740575]  i915_hpd_irq_setup+0xa5/0x100 [i915]
[    4.740649]  intel_hpd_init+0x68/0x80 [i915]
[    4.740716]  i915_driver_load+0xe19/0x1380 [i915]
[    4.740784]  i915_pci_probe+0x32/0x90 [i915]
[    4.740799]  pci_device_probe+0x8b/0xf0
[    4.740815]  driver_probe_device+0x2b6/0x450
[    4.740828]  __driver_attach+0xda/0xe0
[    4.740841]  ? driver_probe_device+0x450/0x450
[    4.740853]  bus_for_each_dev+0x5b/0x90
[    4.740865]  driver_attach+0x19/0x20
[    4.740878]  bus_add_driver+0x166/0x260
[    4.740892]  driver_register+0x5b/0xd0
[    4.740906]  ? 0xffffffffa0166000
[    4.740920]  __pci_register_driver+0x47/0x50
[    4.740985]  i915_init+0x5c/0x5e [i915]
[    4.740999]  do_one_initcall+0x3e/0x160
[    4.741015]  ? __vunmap+0x7c/0xc0
[    4.741029]  ? kmem_cache_alloc+0xcf/0x120
[    4.741045]  do_init_module+0x55/0x1c4
[    4.741060]  load_module+0x1f3f/0x25b0
[    4.741073]  ? __symbol_put+0x40/0x40
[    4.741086]  ? kernel_read_file+0x100/0x190
[    4.741100]  SYSC_finit_module+0xbc/0xf0
[    4.741112]  SyS_finit_module+0x9/0x10
[    4.741125]  entry_SYSCALL_64_fastpath+0x17/0x98
[    4.741135] RIP: 0033:0x7f8559a140f9
[    4.741145] RSP: 002b:00007fff7509a3e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[    4.741161] RAX: ffffffffffffffda RBX: 00007f855aba02d1 RCX: 00007f8559a140f9
[    4.741172] RDX: 0000000000000000 RSI: 000055b6db0914f0 RDI: 0000000000000011
[    4.741183] RBP: 0000000000020000 R08: 0000000000000000 R09: 000000000000000e
[    4.741193] R10: 0000000000000011 R11: 0000000000000246 R12: 000055b6db0854d0
[    4.741204] R13: 000055b6db091150 R14: 0000000000000000 R15: 000055b6db035924

v2: Set dev_priv->display_irqs_enabled to true for all platforms other
than vlv/chv that manually control the display power domain.

Fixes: 19625e85c6 ("drm/i915: Enable polling when we don't have hpd")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97798
Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lyude <cpaul@redhat.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Hans de Goede <jwrdegoede@fedoraproject.org>
Cc: stable@vger.kernel.org
Link: http://patchwork.freedesktop.org/patch/msgid/20170215131547.5064-1-chris@chris-wilson.co.uk
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2017-02-16 09:56:43 +00:00
Chris Wilson
bd64818dfe drm/i915: Only apply the jump to the "efficient RPS" frequency on startup
Currently we apply the jump to rpe if we are below it and the GPU needs
more power. For some GPUs, the rpe is 75% of the maximum range causing
us to dramatically overshoot low power applications *and* unable to
reach the low frequency that can most efficiently deliver their
workload.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170210150348.22146-3-chris@chris-wilson.co.uk
Reviewed-by: Radoslaw Szwichtenberg <radoslaw.szwichtenberg@intel.com>
2017-02-14 22:35:46 +00:00
Chris Wilson
17136d548e drm/i915: Don't accidentally increase the frequency in handling DOWN rps
If we receive a DOWN_TIMEOUT rps interrupt, we respond by reducing the
GPU clocks significantly. Before we do, double check that the frequency
we pick is actually a decrease.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170210150348.22146-2-chris@chris-wilson.co.uk
Reviewed-by: Radoslaw Szwichtenberg <radoslaw.szwichtenberg@intel.com>
2017-02-14 22:35:39 +00:00
Lyude
317eaa9508 drm/i915/debugfs: Add i915_hpd_storm_ctl
This adds a file in i915's debugfs directory that allows userspace to
manually control HPD storm detection. This is mainly for hotplugging
tests, where we might want to test HPD storm functionality or disable
storm detection to speed up hotplugging tests without breaking anything.

Changes since v1:
- Make HPD storm interval configurable
- Misc code cleanup

Signed-off-by: Lyude <lyude@redhat.com>
Acked-by: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Tomeu Vizoso <tomeu@tomeuvizoso.net>
2017-02-10 14:04:00 -05:00
Shawn Guo
967dd48417 drm: remove drm_vblank_no_hw_counter assignment from driver code
Core code already makes drm_driver.get_vblank_counter hook optional by
letting drm_vblank_no_hw_counter be the default implementation for the
function hook.  So the drm_vblank_no_hw_counter assignment in the driver
code becomes redundant and can be removed now.

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Cc: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Liviu Dudau <liviu.dudau@arm.com>
Cc: Mali DP Maintainers <malidp@foss.arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Boris Brezillon <boris.brezillon@free-electrons.com>
Cc: Inki Dae <inki.dae@samsung.com>
Cc: Stefan Agner <stefan@agner.ch>
Cc: Xinliang Liu <z.liuxinliang@hisilicon.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Philipp Zabel <p.zabel@pengutronix.de>
Cc: CK Hu <ck.hu@mediatek.com>
Cc: Neil Armstrong <narmstrong@baylibre.com>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Marek Vasut <marex@denx.de>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Mark Yao <mark.yao@rock-chips.com>
Cc: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Cc: Maxime Ripard <maxime.ripard@free-electrons.com>
Cc: Jyri Sarha <jsarha@ti.com>
Cc: Eric Anholt <eric@anholt.net>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Acked-by: Neil Armstrong <narmstrong@baylibre.com>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Alexey Brodkin <abrodkin@synopsys.com>
Acked-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1486458995-31018-3-git-send-email-shawnguo@kernel.org
2017-02-07 21:43:55 +01:00
Imre Deak
1a56b1a2db drm/i915/gen5+, pch: Enable hotplug detection early
To be consistent with the recent change to enable hotplug detection
early on GEN9 platforms do the same on all non-GMCH platforms starting
from GEN5. On GMCH platforms enabling detection without interrupts isn't
trivial, since AUX and HPD have a shared interrupt line. It could be
done there too by using a SW interrupt mask, but I punt on that for now.

Cc: Shashank Sharma <shashank.sharma@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Shashank Sharma <shashank.sharma@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1485509961-9010-5-git-send-email-imre.deak@intel.com
2017-02-06 16:33:42 +02:00
Imre Deak
7fff8126d9 drm/i915/gen9+: Enable hotplug detection early
For LSPCON resume time initialization we need to sample the
corresponding pin's HPD level, but this is only available when HPD
detection is enabled. Currently we enable detection only when enabling
HPD interrupts which is too late, so bring the enabling of detection
earlier.

This is needed by the next patch.

Cc: Shashank Sharma <shashank.sharma@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: <stable@vger.kernel.org> # v4.9+
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Shashank Sharma <shashank.sharma@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1485509961-9010-2-git-send-email-imre.deak@intel.com
2017-02-06 16:32:32 +02:00
Chris Wilson
9fcee2f77e drm/i915: Report the failure to write to the punit
The write to the punit may fail, so propagate the error code back to its
callers. Of particular interest are the RPS writes, so add appropriate
user error codes and logging.

v2: Add DEBUG for failed frequency changes during RPS.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170126101919.13211-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2017-01-26 17:09:44 +00:00
Jerome Anand
eef57324d9 drm/i915: setup bridge for HDMI LPE audio driver
Enable support for HDMI LPE audio mode on Baytrail and
Cherrytrail when HDaudio controller is not detected

Setup minimum required resources during i915_driver_load:
1. Create a platform device to share MMIO/IRQ resources
2. Make the platform device child of i915 device for runtime PM.
3. Create IRQ chip to forward HDMI LPE audio irqs.

HDMI LPE audio driver (a standalone sound driver) probes the
LPE audio device and creates a new sound card.

Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Jerome Anand <jerome.anand@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2017-01-25 14:21:47 +01:00
Chris Wilson
f747026c2b drm/i915: Only run execlist context-switch handler after an interrupt
Mark when we run the execlist tasklet following the interrupt, so we
don't probe a potentially uninitialised register when submitting the
contexts multiple times before the hardware responds.

v2: Use a shared engine->irq_posted
v3: Always use locked bitops to be sure of atomicity wrt to other bits
in the mask.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170124152021.26587-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2017-01-24 15:56:01 +00:00
Chris Wilson
538b257dae drm/i915: Move breadcrumbs irq_posted up a level to engine
In the next patch, we will use the irq_posted technique for another
engine interrupt, rather than use two members for the atomic updates, we
can use two bits of one instead. First, we need to update the
breadcrumbs to use the new common engine->irq_posted.

v2: Use set_bit() rather than __set_bit() to ensure atomicity with
respect to other bits in the mask

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170124151805.26146-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2017-01-24 15:55:36 +00:00
Sagar Arun Kamble
7e79a6836c drm/i915: Set adjustment to zero on Up/Down interrupts if freq is already max/min
When we reach the user's RPS limits, stop requesting an adjustment. Even
though we will clamp the requested frequency later, we rely on interrupt
masking to disable further adjustments in the same direction. Even
though it is unlikely (one scenario is a bug in the driver, another is
careful manipulation through the uAPI) if we keep exponentially
increasing the adjustment value, it will wrap and cause a negative
adjustment.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1484884104-28134-2-git-send-email-sagar.a.kamble@intel.com
2017-01-20 09:32:47 +00:00
Michel Thierry
87c390b67b drm/i915: Keep i915_handle_error kerneldoc parameters together
And before the function description.
Tidy up from commit 14bb2c1179 ("drm/i915: Fix a buch of kerneldoc
warnings"), all others kerneldoc blocks look ok.

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170112041817.1102-2-michel.thierry@intel.com
2017-01-12 16:09:35 +02:00
Tomeu Vizoso
246ee524a2 drm/i915: Put "cooked" vlank counters in frame CRC lines
Use drm_accurate_vblank_count so we have the full 32 bit to represent
the frame counter and userspace has a simpler way of knowing when the
counter wraps around.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Robert Foss <robert.foss@collabora.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170110134305.26326-3-tomeu.vizoso@collabora.com
2017-01-10 17:30:56 +01:00
Tomeu Vizoso
8c6b709d96 drm/i915: Use new CRC debugfs API
The core provides now an ABI to userspace for generation of frame CRCs,
so implement the ->set_crc_source() callback and reuse as much code as
possible with the previous ABI implementation.

When handling the pageflip interrupt, we skip 1 or 2 frames depending on
the HW because they contain wrong values. For the legacy ABI for
generating frame CRCs, this was done in userspace but now that we have a
generic ABI it's better if it's not exposed by the kernel.

v2:
    - Leave the legacy implementation in place as the ABI implementation
      in the core is incompatible with it.
v3:
    - Use the "cooked" vblank counter so we have a whole 32 bits.
    - Make sure we don't mess with the state of the legacy CRC capture
      ABI implementation.
v4:
    - Keep use of get_vblank_counter as in the legacy code, will be
      changed in a followup commit.

v5:
    - Skip first frame or two as it's known that they contain wrong
      data.
    - A few fixes suggested by Emil Velikov.

v6:
    - Rework programming of the HW registers to preserve previous
      behavior.

v7:
    - Address whitespace issue.
    - Added a comment on why in the implementation of the new ABI we
      skip the 1st or 2nd frames.

v9:
    - Add stub for intel_crtc_set_crc_source.

v12:
    - Rebased.
    - Remove stub for intel_crtc_set_crc_source and instead set the
      callback to NULL (Jani Nikula).

v15:
    - Rebased.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Robert Foss <robert.foss@collabora.com>

irq
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170110134305.26326-2-tomeu.vizoso@collabora.com
2017-01-10 17:30:50 +01:00
Ander Conselvan de Oliveira
cc3f90f063 drm/i915/glk: Reuse broxton code for geminilake
Geminilake is mostly backwards compatible with broxton, so change most
of the IS_BROXTON() checks to IS_GEN9_LP(). Differences between the
platforms will be implemented in follow-up patches.

v2: Don't reuse broxton's path in intel_update_max_cdclk().
    Don't set plane count as in broxton.

v3: Rebase

v4: Include the check intel_bios_is_port_hpd_inverted().
    Commit message.

v5: Leave i915_dmc_info() out; glk's csr version != bxt's. (Rodrigo)

v6: Rebase.

v7: Convert a few mode IS_BROXTON() occurances in pps, ddi, dsi and pll
    code. (Rodrigo)

v8: Squash a couple of DDI patches with more conversions. (Rodrigo)

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1480667037-11215-2-git-send-email-ander.conselvan.de.oliveira@intel.com
2016-12-02 16:38:56 +02:00
Arkadiusz Hiler
a80bc45ff0 drm/i915/guc: Drop guc2host/host2guc from names
To facilitate code reorganization we are renaming everything that
contains guc2host or host2guc.

host2guc_action() and host2guc_action_response() become guc_send()
and guc_recv() respectively.

Other host2guc_*() functions become simply guc_*().

Other entities are renamed basing on context they appear in:
 - HOST2GUC_ACTIONS_&           become INTEL_GUC_ACTION_*
 - HOST2GUC_{INTERRUPT,TRIGGER} become GUC_SEND_{INTERRUPT,TRIGGER}
 - GUC2HOST_STATUS_*            become INTEL_GUC_STATUS_*
 - GUC2HOST_MSG_*               become INTEL_GUC_RECV_MSG_*
 - action_lock                 becomes send_mutex

v2: drop unnecessary backslashes and use BIT() instead of '<<'
v3: shortened enum names and INTEL_GUC_STATUS_*

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1480096777-12573-3-git-send-email-arkadiusz.hiler@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-11-25 21:28:57 +00:00
Tvrtko Ursulin
b243f530b2 drm/i915: dev_priv cleanup in i915_irq.c
And a little bit of function prototype changes.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-11-17 13:56:28 +00:00
Tvrtko Ursulin
4805fe82c0 drm/i915: Further assorted dev_priv cleanups
A small selection of macros which can only accept dev_priv from
now on and a resulting trickle of fixups.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
2016-11-11 14:58:26 +00:00
Tvrtko Ursulin
56b857a5e3 drm/i915: More assorted dev_priv cleanups
A small selection of macros which can only accept dev_priv from
now on and a resulting trickle of fixups.

v2: Keep original order. (Ville Syrjala)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
2016-11-11 14:58:26 +00:00
Mika Kuoppala
3ac168a70b drm/i915: Move hangcheck code out from i915_irq.c
Create new file for hangcheck specific code, intel_hangcheck.c,
and move all related code in it.

v2: s/intel_engine_hangcheck/intel_engine (Chris)

No functional changes.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1478018583-5816-1-git-send-email-mika.kuoppala@intel.com
2016-11-02 11:59:10 +02:00
Ville Syrjälä
98187836fc drm/i915: Always use intel_get_crtc_for_pipe()
Replace the open coded dev_priv->pipe_to_crtc_mapping[] usage with
intel_get_crtc_for_pipe().

Mostly done with coccinelle, with a few manual tweaks

@@
expression E1, E2;
@@
(
- E1->pipe_to_crtc_mapping[E2]
+ intel_get_crtc_for_pipe(E1, E2)
|
- E1->plane_to_crtc_mapping[E2]
+ intel_get_crtc_for_plane(E1, E2)
)

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477946245-14134-12-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-11-01 16:40:38 +02:00
Ville Syrjälä
b91eb5cce6 drm/i915: Pass dev_priv to intel_get_crtc_for_pipe()
Unify our approach to things by passing around dev_priv instead of dev.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477946245-14134-11-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-11-01 16:40:38 +02:00
Ville Syrjälä
e2af48c66b drm/i915: Store struct intel_crtc * in {pipe,plane}_to_crtc_mapping[]
A lot of users of the {pipe,plane}_to_crtc_mapping[] will end up
casting the result to intel_crtc, so let's just store the intel_crtc
pointer in the first place.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1477946245-14134-7-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-11-01 16:40:38 +02:00
Chris Wilson
cb399eabc4 drm/i915: Avoid accessing request->timeline outside of its lifetime
Whilst waiting on a request, we may do so without holding any locks or
any guards beyond a reference to the request. In order to avoid taking
locks within request deallocation, we drop references to its timeline
(via the context and ppgtt) upon retirement. We should avoid chasing
such pointers outside of their control, in particular we inspect the
request->timeline to see if we may restore the RPS waitboost for a
client. If we instead look at the engine->timeline, we will have similar
behaviour on both full-ppgtt and !full-ppgtt systems and reduce the
amount of reward we give towards stalling clients (i.e. only if the
client stalls and the GPU is uncontended does it reclaim its boost).
This restores behaviour back to pre-timelines, whilst fixing:

[  645.078485] BUG: KASAN: use-after-free in i915_gem_object_wait_fence+0x1ee/0x2e0 at addr ffff8802335643a0
[  645.078577] Read of size 4 by task gem_exec_schedu/28408
[  645.078638] CPU: 1 PID: 28408 Comm: gem_exec_schedu Not tainted 4.9.0-rc2+ #64
[  645.078724] Hardware name:                  /        , BIOS PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015
[  645.078816]  ffff88022daef9a0 ffffffff8143d059 ffff880235402a80 ffff880233564200
[  645.078998]  ffff88022daef9c8 ffffffff81229c5c ffff88022daefa48 ffff880233564200
[  645.079172]  ffff880235402a80 ffff88022daefa38 ffffffff81229ef0 000000008110a796
[  645.079345] Call Trace:
[  645.079404]  [<ffffffff8143d059>] dump_stack+0x68/0x9f
[  645.079467]  [<ffffffff81229c5c>] kasan_object_err+0x1c/0x70
[  645.079534]  [<ffffffff81229ef0>] kasan_report_error+0x1f0/0x4b0
[  645.079601]  [<ffffffff8122a244>] kasan_report+0x34/0x40
[  645.079676]  [<ffffffff81634f5e>] ? i915_gem_object_wait_fence+0x1ee/0x2e0
[  645.079741]  [<ffffffff81229951>] __asan_load4+0x61/0x80
[  645.079807]  [<ffffffff81634f5e>] i915_gem_object_wait_fence+0x1ee/0x2e0
[  645.079876]  [<ffffffff816364bf>] i915_gem_object_wait+0x19f/0x590
[  645.079944]  [<ffffffff81636320>] ? i915_gem_object_wait_priority+0x500/0x500
[  645.080016]  [<ffffffff8110fb30>] ? debug_show_all_locks+0x1e0/0x1e0
[  645.080084]  [<ffffffff8110abdc>] ? check_chain_key+0x14c/0x210
[  645.080157]  [<ffffffff8110a796>] ? __lock_is_held+0x46/0xc0
[  645.080226]  [<ffffffff8163bc61>] ? i915_gem_set_domain_ioctl+0x141/0x690
[  645.080296]  [<ffffffff8163bcc2>] i915_gem_set_domain_ioctl+0x1a2/0x690
[  645.080366]  [<ffffffff811f8f85>] ? __might_fault+0x75/0xe0
[  645.080433]  [<ffffffff815a55f7>] drm_ioctl+0x327/0x640
[  645.080508]  [<ffffffff8163bb20>] ? i915_gem_obj_prepare_shmem_write+0x3a0/0x3a0
[  645.080603]  [<ffffffff815a52d0>] ? drm_ioctl_permit+0x120/0x120
[  645.080670]  [<ffffffff8110abdc>] ? check_chain_key+0x14c/0x210
[  645.080738]  [<ffffffff81275717>] do_vfs_ioctl+0x127/0xa20
[  645.080804]  [<ffffffff8120268c>] ? do_mmap+0x47c/0x580
[  645.080871]  [<ffffffff811da567>] ? vm_mmap_pgoff+0x117/0x140
[  645.080938]  [<ffffffff812755f0>] ? ioctl_preallocate+0x150/0x150
[  645.081011]  [<ffffffff81108c53>] ? up_write+0x23/0x50
[  645.081078]  [<ffffffff811da567>] ? vm_mmap_pgoff+0x117/0x140
[  645.081145]  [<ffffffff811da450>] ? vma_is_stack_for_current+0x90/0x90
[  645.081214]  [<ffffffff8110d853>] ? mark_held_locks+0x23/0xc0
[  645.082030]  [<ffffffff81288408>] ? __fget+0x168/0x250
[  645.082106]  [<ffffffff819ad517>] ? entry_SYSCALL_64_fastpath+0x5/0xb1
[  645.082176]  [<ffffffff81288592>] ? __fget_light+0xa2/0xc0
[  645.082242]  [<ffffffff8127604c>] SyS_ioctl+0x3c/0x70
[  645.082309]  [<ffffffff819ad52e>] entry_SYSCALL_64_fastpath+0x1c/0xb1
[  645.082374] Object at ffff880233564200, in cache kmalloc-8192 size: 8192
[  645.082431] Allocated:
[  645.082480] PID = 28408
[  645.082535]  [  645.082566] [<ffffffff8103ae66>] save_stack_trace+0x16/0x20
[  645.082623]  [  645.082656] [<ffffffff81228b06>] save_stack+0x46/0xd0
[  645.082716]  [  645.082756] [<ffffffff812292fd>] kasan_kmalloc+0xad/0xe0
[  645.082817]  [  645.082848] [<ffffffff81631752>] i915_ppgtt_create+0x52/0x220
[  645.082908]  [  645.082941] [<ffffffff8161db96>] i915_gem_create_context+0x396/0x560
[  645.083027]  [  645.083059] [<ffffffff8161f857>] i915_gem_context_create_ioctl+0x97/0xf0
[  645.083152]  [  645.083183] [<ffffffff815a55f7>] drm_ioctl+0x327/0x640
[  645.083243]  [  645.083274] [<ffffffff81275717>] do_vfs_ioctl+0x127/0xa20
[  645.083334]  [  645.083372] [<ffffffff8127604c>] SyS_ioctl+0x3c/0x70
[  645.083432]  [  645.083464] [<ffffffff819ad52e>] entry_SYSCALL_64_fastpath+0x1c/0xb1
[  645.083551] Freed:
[  645.083599] PID = 27629
[  645.083648]  [  645.083676] [<ffffffff8103ae66>] save_stack_trace+0x16/0x20
[  645.083738]  [  645.083770] [<ffffffff81228b06>] save_stack+0x46/0xd0
[  645.083830]  [  645.083862] [<ffffffff81229203>] kasan_slab_free+0x73/0xc0
[  645.083922]  [  645.083961] [<ffffffff812279c9>] kfree+0xa9/0x170
[  645.084021]  [  645.084053] [<ffffffff81629f60>] i915_ppgtt_release+0x100/0x180
[  645.084139]  [  645.084171] [<ffffffff8161d414>] i915_gem_context_free+0x1b4/0x230
[  645.084257]  [  645.084288] [<ffffffff816537b2>] intel_lr_context_unpin+0x192/0x230
[  645.084380]  [  645.084413] [<ffffffff81645250>] i915_gem_request_retire+0x620/0x630
[  645.084500]  [  645.085226] [<ffffffff816473d1>] i915_gem_retire_requests+0x181/0x280
[  645.085313]  [  645.085352] [<ffffffff816352ba>] i915_gem_retire_work_handler+0xca/0xe0
[  645.085440]  [  645.085471] [<ffffffff810c725b>] process_one_work+0x4fb/0x920
[  645.085532]  [  645.085562] [<ffffffff810c770d>] worker_thread+0x8d/0x840
[  645.085622]  [  645.085653] [<ffffffff810d21e5>] kthread+0x185/0x1b0
[  645.085718]  [  645.085750] [<ffffffff819ad7a7>] ret_from_fork+0x27/0x40
[  645.085811] Memory state around the buggy address:
[  645.085869]  ffff880233564280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  645.085956]  ffff880233564300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  645.086053] >ffff880233564380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  645.086138]                                ^
[  645.086193]  ffff880233564400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  645.086283]  ffff880233564480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

v2: Add a comment to document the hint like nature of
 intel_engine_last_submit()

Fixes: 73cb97010d ("drm/i915: Combine seqno + tracking into a global timeline struct")
Fixes: 80b204bce8 ("drm/i915: Enable multiple timelines")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161101100317.11129-1-chris@chris-wilson.co.uk
2016-11-01 10:48:40 +00:00
Chris Wilson
73cb97010d drm/i915: Combine seqno + tracking into a global timeline struct
Our timelines are more than just a seqno. They also provide an ordered
list of requests to be executed. Due to the restriction of handling
individual address spaces, we are limited to a timeline per address
space but we use a fence context per engine within.

Our first step to introducing independent timelines per context (i.e. to
allow each context to have a queue of requests to execute that have a
defined set of dependencies on other requests) is to provide a timeline
abstraction for the global execution queue.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-23-chris@chris-wilson.co.uk
2016-10-28 20:53:51 +01:00
Tvrtko Ursulin
1353ec3833 drm/i915: Correct pipe fault reporting string
Newline somehow ended up in the middle of the line.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1477572512-4030-1-git-send-email-tvrtko.ursulin@linux.intel.com
2016-10-27 15:08:43 +01:00
Akash Goel
5aa1ee4b12 drm/i915: Add stats for GuC log buffer flush interrupts
GuC firmware sends an interrupt to flush the log buffer when it
becomes half full. GuC firmware also tracks how many times the
buffer overflowed.
It would be useful to maintain a statistics of how many flush
interrupts were received and for which type of log buffer,
along with the overflow count of each buffer type.
Augmented i915_log_info debugfs to report back these statistics.

v2:
- Update the logic to detect multiple overflows between the 2
  flush interrupts and also log a message for overflow (Tvrtko)
- Track the number of times there was no free sub buffer to capture
  the GuC log buffer. (Tvrtko)

v3:
- Fix the printf field width for overflow counter, set it to 10 as per the
  max value of u32, which takes 10 digits in decimal form. (Tvrtko)

v4:
- Move the log buffer overflow handling to a new function for better
  readability. (Tvrtko)

Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-10-25 09:34:23 +01:00
Sagar Arun Kamble
4100b2ab3e drm/i915: Handle log buffer flush interrupt event from GuC
GuC ukernel sends an interrupt to Host to flush the log buffer
and expects Host to correspondingly update the read pointer
information in the state structure, once it has consumed the
log buffer contents by copying them to a file or buffer.
Even if Host couldn't copy the contents, it can still update the
read pointer so that logging state is not disturbed on GuC side.

v2:
- Use a dedicated workqueue for handling flush interrupt. (Tvrtko)
- Reduce the overall log buffer copying time by skipping the copy of
  crash buffer area for regular cases and copying only the state
  structure data in first page.

v3:
 - Create a vmalloc mapping of log buffer. (Chris)
 - Cover the flush acknowledgment under rpm get & put.(Chris)
 - Revert the change of skipping the copy of crash dump area, as
   not really needed, will be covered by subsequent patch.

v4:
 - Destroy the wq under the same condition in which it was created,
   pass dev_piv pointer instead of dev to newly added GuC function,
   add more comments & rename variable for clarity. (Tvrtko)

v5:
- Allocate & destroy the dedicated wq, for handling flush interrupt,
  from the setup/teardown routines of GuC logging. (Chris)
- Validate the log buffer size value retrieved from state structure
  and do some minor cleanup. (Tvrtko)
- Fix error/warnings reported by checkpatch. (Tvrtko)
- Rebase.

v6:
 - Remove the interrupts_enabled check from guc_capture_logs_work, need
   to process that last work item also, queued just before disabling the
   interrupt as log buffer flush interrupt handling is a bit different
   case where GuC is actually expecting an ACK from host, which should be
   provided to keep the logging going.
   Sync against the work will be done by caller disabling the interrupt.
 - Don't sample the log buffer size value from state structure, directly
   use the expected value to move the pointer & do the copy and that cannot
   go wrong (out of bounds) as Driver only allocated the log buffer and the
   relay buffers. Driver should refrain from interpreting the log packet,
   as much possible and let Userspace parser detect the anomaly. (Chris)

v7:
- Use switch statement instead of 'if else' for retrieving the GuC log
  buffer size. (Tvrtko)
- Refactored the log buffer copying function and shortended the name of
  couple of variables for better readability. (Tvrtko)

v8:
- Make the dedicated wq as a high priority one to further reduce the
  turnaround time of handing log buffer flush event from GuC.

Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-10-25 09:34:06 +01:00
Sagar Arun Kamble
26705e2075 drm/i915: Support for GuC interrupts
There are certain types of interrupts which Host can receive from GuC.
GuC ukernel sends an interrupt to Host for certain events, like for
example retrieve/consume the logs generated by ukernel.
This patch adds support to receive interrupts from GuC but currently
enables & partially handles only the interrupt sent by GuC ukernel.
Future patches will add support for handling other interrupt types.

v2:
- Use common low level routines for PM IER/IIR programming (Chris)
- Rename interrupt functions to gen9_xxx from gen8_xxx (Chris)
- Replace disabling of wake ref asserts with rpm get/put (Chris)

v3:
- Update comments for more clarity. (Tvrtko)
- Remove the masking of GuC interrupt, which was kept masked till the
  start of bottom half, its not really needed as there is only a
  single instance of work item & wq is ordered. (Tvrtko)

v4:
- Rebase.
- Rename guc_events to pm_guc_events so as to be indicative of the
  register/control block it is associated with. (Chris)
- Add handling for back to back log buffer flush interrupts.

v5:
- Move the read & clearing of register, containing Guc2Host message
  bits, outside the irq spinlock. (Tvrtko)

v6:
- Move the log buffer flush interrupt related stuff to the following
  patch so as to do only generic bits in this patch. (Tvrtko)
- Rebase.

v7:
- Remove the interrupts_enabled check from gen9_guc_irq_handler, want to
  process that last interrupt also before disabling the interrupt, sync
  against the work queued by irq handler will be done by caller disabling
  the interrupt.

Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-10-25 09:34:06 +01:00
Akash Goel
f4e9af4f5a drm/i915: Add low level set of routines for programming PM IER/IIR/IMR register set
So far PM IER/IIR/IMR registers were being used only for Turbo related
interrupts. But interrupts coming from GuC also use the same set.
As a precursor to supporting GuC interrupts, added new low level routines
so as to allow sharing the programming of PM IER/IIR/IMR registers between
Turbo & GuC.
Also similar to PM IMR, maintaining a bitmask for PM IER register, to allow
easy sharing of it between Turbo & GuC without involving a rmw operation.

v2:
- For appropriateness & avoid any ambiguity, rename old functions
  enable/disable pm_irq to mask/unmask pm_irq and rename new functions
  enable/disable pm_interrupts to enable/disable pm_irq. (Tvrtko)
- Use u32 in place of uint32_t. (Tvrtko)

v3:
- Rename the fields pm_irq_mask & pm_ier_mask and do some cleanup. (Chris)
- Rebase.

v4: Fix the inadvertent disabling of User interrupt for VECS ring causing
    failure for certain IGTs.

v5: Use dev_priv with HAS_VEBOX macro. (Tvrtko)

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-10-25 09:34:06 +01:00
Chris Wilson
eaa14c2486 drm/i915: Stop reporting error details in dmesg as well as the error-state
As we already capture all the information from the registers into the
error-state, also dumping that to dmesg just generates noise that upsets
CI and users alike (and doesn't provide us with any more information).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20161019125203.28851-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2016-10-20 13:00:06 +01:00
Tvrtko Ursulin
5db9401983 drm/i915: Make IS_GEN macros only take dev_priv
Saves 1416 bytes of .rodata strings.

v2: Add parantheses around dev_priv. (Ville Syrjala)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Jani Nikula <jani.nikula@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1476352990-2504-1-git-send-email-tvrtko.ursulin@linux.intel.com
2016-10-14 12:23:22 +01:00
Tvrtko Ursulin
55b8f2a76d drm/i915: Make INTEL_GEN only take dev_priv
Saves 968 bytes of .rodata strings.

v2: Add parantheses around dev_priv. (Ville Syrjala)
v3: Rebase.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Jani Nikula <jani.nikula@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2016-10-14 12:23:19 +01:00
Tvrtko Ursulin
3c9192bc08 drm/i915: Make HAS_L3_DPF only take dev_priv
Saves 472 bytes of .rodata strings.

v2: Add parantheses around dev_priv. (Ville Syrjala)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Jani Nikula <jani.nikula@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2016-10-14 12:23:19 +01:00
Tvrtko Ursulin
e2d214ae2b drm/i915: Make IS_BROXTON only take dev_priv
Saves 1392 bytes of .rodata strings.

Also change a few function/macro prototypes in i915_gem_gtt.c
from dev to dev_priv where it made more sense to do so.

v2: Add parantheses around dev_priv. (Ville Syrjala)
v3: Mention function prototype changes. (David Weinehall)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: David Weinehall <david.weinehall@linux.intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Jani Nikula <jani.nikula@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
2016-10-14 12:23:19 +01:00
Tvrtko Ursulin
772c2a519c drm/i915: Make IS_HASWELL only take dev_priv
Saves 2432 bytes of .rodata strings.

v2: Add parantheses around dev_priv. (Ville Syrjala)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Jani Nikula <jani.nikula@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2016-10-14 12:23:19 +01:00
Tvrtko Ursulin
50a0bc9054 drm/i915: Make INTEL_DEVID only take dev_priv
Saves 4472 bytes of .rodata strings.

v2: Add parantheses around dev_priv. (Ville Syrjala)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Jani Nikula <jani.nikula@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2016-10-14 12:23:19 +01:00
Tvrtko Ursulin
6e266956a5 drm/i915: Make INTEL_PCH_TYPE & co only take dev_priv
This saves 1872 bytes of .rodata strings.

v2:
 * Rebase.
 * Add parantheses around dev_priv. (Ville Syrjala)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Jani Nikula <jani.nikula@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2016-10-14 12:23:19 +01:00
Akash Goel
3b3f1650b1 drm/i915: Allocate intel_engine_cs structure only for the enabled engines
With the possibility of addition of many more number of rings in future,
the drm_i915_private structure could bloat as an array, of type
intel_engine_cs, is embedded inside it.
	struct intel_engine_cs engine[I915_NUM_ENGINES];
Though this is still fine as generally there is only a single instance of
drm_i915_private structure used, but not all of the possible rings would be
enabled or active on most of the platforms. Some memory can be saved by
allocating intel_engine_cs structure only for the enabled/active engines.
Currently the engine/ring ID is kept static and dev_priv->engine[] is simply
indexed using the enums defined in intel_engine_id.
To save memory and continue using the static engine/ring IDs, 'engine' is
defined as an array of pointers.
	struct intel_engine_cs *engine[I915_NUM_ENGINES];
dev_priv->engine[engine_ID] will be NULL for disabled engine instances.

There is a text size reduction of 928 bytes, from 1028200 to 1027272, for
i915.o file (but for i915.ko file text size remain same as 1193131 bytes).

v2:
- Remove the engine iterator field added in drm_i915_private structure,
  instead pass a local iterator variable to the for_each_engine**
  macros. (Chris)
- Do away with intel_engine_initialized() and instead directly use the
  NULL pointer check on engine pointer. (Chris)

v3:
- Remove for_each_engine_id() macro, as the updated macro for_each_engine()
  can be used in place of it. (Chris)
- Protect the access to Render engine Fault register with a NULL check, as
  engine specific init is done later in Driver load sequence.

v4:
- Use !!dev_priv->engine[VCS] style for the engine check in getparam. (Chris)
- Kill the superfluous init_engine_lists().

v5:
- Cleanup the intel_engines_init() & intel_engines_setup(), with respect to
  allocation of intel_engine_cs structure. (Chris)

v6:
- Rebase.

v7:
- Optimize the for_each_engine_masked() macro. (Chris)
- Change the type of 'iter' local variable to enum intel_engine_id. (Chris)
- Rebase.

v8: Rebase.

v9: Rebase.

v10:
- For index calculation use engine ID instead of pointer based arithmetic in
  intel_engine_sync_index() as engine pointers are not contiguous now (Chris)
- For appropriateness, rename local enum variable 'iter' to 'id'. (Joonas)
- Use for_each_engine macro for cleanup in intel_engines_init() and remove
  check for NULL engine pointer in cleanup() routines. (Joonas)

v11: Rebase.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1476378888-7372-1-git-send-email-akash.goel@intel.com
2016-10-14 09:58:43 +01:00
Chris Wilson
86e83e35d1 drm/i915: Merge duplicate gen4 and vlv/chv enable vblank callbacks
gen4/vlv/chv all use the same bits in pipestat to enable the vblank
interrupt, so they can share the same callbacks to enable/disable.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161007194953.15616-1-chris@chris-wilson.co.uk
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-10-13 21:58:52 +01:00
Chris Wilson
0e70447605 drm/i915: Move common code out of i915_gpu_error.c
In the next patch, I want to conditionally compile i915_gpu_error.c and
that requires moving the functions used by debug out of
i915_gpu_error.c!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161012090522.367-1-chris@chris-wilson.co.uk
2016-10-12 12:00:32 +01:00
Chris Wilson
348b9b1192 drm/i915: Use correct index for backtracking HUNG semaphores
When decoding the semaphores inside hangcheck, we need to use the hw-id
and not the local array index.

Fixes: de1add3605 ("drm/i915: Decouple execbuf uAPI ...")
Testcase: igt/gem_exec_whisper/hang # gen6-7
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: stable@vger.kernel.org
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161003124516.12388-3-chris@chris-wilson.co.uk
2016-10-03 20:59:55 +01:00
Chris Wilson
f2a91d1a6f drm/i915: Restore current RPS state after reset
Following commit 821ed7df6e ("drm/i915: Update reset path to fix
incomplete requests") we no longer mark the context as lost on reset as
we keep the requests (and contexts) alive. However, RPS remains reset
and we need to restore the current state to match the in-flight
requests.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97824
Fixes: 821ed7df6e ("drm/i915: Update reset path to fix incomplete requests")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Arun Siluvery <arun.siluvery@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160921135108.29574-1-chris@chris-wilson.co.uk
2016-09-21 16:52:39 +01:00
Ben Widawsky
f9e6137280 drm/i915: Try to print INSTDONE bits for all slice/subslice
v2: (Imre)
- Access only subslices that are known to exist.
- Reset explicitly the MCR selector to slice/sub-slice ID 0 after the
  readout.
- Use the subslice INSTDONE bits for the hangcheck/subunits-stuck
  detection too.
- Take the uncore lock for the MCR-select/subslice-readout sequence.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1474379673-28326-2-git-send-email-imre.deak@intel.com
2016-09-21 15:33:29 +03:00
Ben Widawsky
d636951ec0 drm/i915: Cleanup instdone collection
Consolidate the instdone logic so we can get a bit fancier. This patch also
removes the duplicated print of INSTDONE[0].

v2: (Imre)
- Rebased on top of hangcheck INSTDONE changes.
- Move all INSTDONE registers into a single struct, store it within the
  engine error struct during error capturing.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1474379673-28326-1-git-send-email-imre.deak@intel.com
2016-09-21 15:32:36 +03:00
Dave Gordon
b20e3cfe4b drm/i915: clarify PMINTRMSK/pm_intr_keep usage
No functional changes; just renaming a bit, tweaking a datatype,
prettifying layout, and adding comments, in particular in the
GuC setup code that touches this data.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1473711577-11454-2-git-send-email-david.s.gordon@intel.com
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-09-15 10:55:43 +01:00
Chris Wilson
32c2b4bda6 drm/i915: Avoid incrementing hangcheck whilst waiting for external fence
If we are waiting upon an external fence, from the pov of hangcheck the
engine is stuck on the last submitted seqno. Currently we give a small
increment to the hangcheck score in order to catch a stuck waiter /
driver. Now that we both have an independent wait hangcheck and may be
stuck waiting on an external fence, resetting the GPU has little effect
on that external fence. As we cannot advance by resetting, skip
incrementing the hangcheck score.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160909131201.16673-19-chris@chris-wilson.co.uk
2016-09-09 14:23:07 +01:00
Chris Wilson
80b5bdbdcb drm/i915: Ignore valid but unknown semaphores
If we find a ring waiting on a semaphore for another assigned but not yet
emitted request, treat it as valid and waiting.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160909131201.16673-18-chris@chris-wilson.co.uk
2016-09-09 14:23:07 +01:00
Chris Wilson
780f262a70 drm/i915: Replace wait-on-mutex with wait-on-bit in reset worker
Since we have a cooperative mode now with a direct reset, we can avoid
the contention on struct_mutex and instead try then sleep on the
I915_RESET_IN_PROGRESS bit. If the mutex is held and that bit is
cleared, all is fine. Otherwise, we sleep for a bit and try again. In
the worst case we sleep for an extra second waiting for the mutex to be
released (no one touching the GPU is allowed the struct_mutex whilst the
I915_RESET_IN_PROGRESS bit is set). But when we have a direct reset,
this allows us to clean up the reset worker faster.

v2: Remember to call wake_up_bit() after changing (for the faster wakeup
as promised)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160909131201.16673-12-chris@chris-wilson.co.uk
2016-09-09 14:23:04 +01:00
Chris Wilson
221fe79945 drm/i915: Perform a direct reset of the GPU from the waiter
If a waiter is holding the struct_mutex, then the reset worker cannot
reset the GPU until the waiter returns. We do not want to return -EAGAIN
form i915_wait_request as that breaks delicate operations like
i915_vma_unbind() which often cannot be restarted easily, and returning
-EIO is just as useless (and has in the past proven dangerous). The
remaining WARN_ON(i915_wait_request) serve as a valuable reminder that
handling errors from an indefinite wait are tricky.

We can keep the current semantic that knowing after a reset is complete,
so is the request, by performing the reset ourselves if we hold the
mutex.

uevent emission is still handled by the reset worker, so it may appear
slightly out of order with respect to the actual reset (and concurrent
use of the device).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160909131201.16673-11-chris@chris-wilson.co.uk
2016-09-09 14:23:04 +01:00
Chris Wilson
8af29b0c78 drm/i915: Separate out reset flags from the reset counter
In preparation for introducing a per-engine reset, we can first separate
the mixing of the reset state from the global reset counter.

The loss of atomicity in updating the reset state poses a small problem
for handling the waiters. For requests, this is solved by advancing the
seqno so that a waiter waking up after the reset knows the request is
complete. For pending flips, we still rely on the increment of the
global reset epoch (as well as the reset-in-progress flag) to signify
when the hardware was reset.

The advantage, now that we do not inspect the reset state during reset
itself i.e. we no longer emit requests during reset, is that we can use
the atomic updates of the state flags to ensure that only one reset
worker is active.

v2: Mika spotted that I transformed the i915_gem_wait_for_error() wakeup
into a waiter wakeup.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Arun Siluvery <arun.siluvery@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470414607-32453-6-git-send-email-arun.siluvery@linux.intel.com
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160909131201.16673-7-chris@chris-wilson.co.uk
2016-09-09 14:23:02 +01:00
Chris Wilson
bafb0fced9 drm/i915: Make for_each_engine_masked() more compact and quicker
Rather than walk the full array of engines checking whether each is in
the mask in turn, we can use the mask to jump to the right engines. This
should quicker for a sparse array of engines or mask, whilst generating
smaller code:

   text	   data	    bss	    dec	    hex	filename
1251010	   4579	    800	1256389	 132bc5	drivers/gpu/drm/i915/i915.ko
1250530	   4579	    800	1255909	 1329e5	drivers/gpu/drm/i915/i915.ko

The downside is that we have to pass in a temporary, alas no C99
iterators yet.

[P.S. Joonas doesn't like having to pass extra temporaries into the
macro, and even less that I called them tmp. As yet, we haven't found a
macro that avoids passing in a temporary that is smaller. We probably
will get C99 iterators first!]

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160827075401.16470-2-chris@chris-wilson.co.uk
2016-08-27 09:42:53 +01:00
Chris Wilson
34730fed9f drm/i915: Ignore stuck requests when considering hangs
If the engine isn't being retired (worker starvation?) then it is
possible for us to repeatedly observe that between consecutive
hangchecks the seqno on the ring to be the same and there remain
unretired requests. Ignore these completely and only regard the engine
as busy for the purpose of hang detection (not stall detection) if there
are outstanding breadcrumbs.

In recent history we have looked at using both the request and seqno as
indication of activity on the engine, but that was reduced to just
inspecting seqno in commit cffa781e59 ("drm/i915: Simplify check for
idleness in hangcheck"). However, in commit dcff85c844 ("drm/i915:
Enable i915_gem_wait_for_idle() without holding struct_mutex"), I made
the decision to use the new common lockless function, under the
assumption that request retirement was more frequent than hangcheck and
so we would not have a stuck busy check. The flaw there was in
forgetting that we accumulate the hang score, and so successive checks
seeing a stuck request, albeit with the GPU advancing elsewhere and so
not necessary the same stuck request, would eventually trigger the hang.

Fixes: dcff85c844 ("drm/i915: Enable i915_gem_wait_for_idle()...")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160820145408.32180-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2016-08-22 13:09:11 +01:00
Chris Wilson
83348ba84e drm/i915: Move missed interrupt detection from hangcheck to breadcrumbs
In commit 2529d57050 ("drm/i915: Drop racy markup of missed-irqs from
idle-worker") the racy detection of missed interrupts was removed when
we went idle. This however opened up the issue that the stuck waiters
were not being reported, causing a test case failure. If we move the
stuck waiter detection out of hangcheck and into the breadcrumb
mechanims (i.e. the waiter) itself, we can avoid this issue entirely.
This leaves hangcheck looking for a stuck GPU (inspecting for request
advancement and HEAD motion), and breadcrumbs looking for a stuck
waiter - hopefully make both easier to understand by their segregation.

v2: Reduce the error message as we now run independently of hangcheck,
and the hanging batch used by igt also counts as a stuck waiter causing
extra warnings in dmesg.
v3: Move the breadcrumb's hangcheck kickstart to the first missed wait.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97104
Fixes: 2529d57050 (waiter"drm/i915: Drop racy markup of missed-irqs...")
Testcase: igt/drv_missed_irq
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470761272-1245-2-git-send-email-chris@chris-wilson.co.uk
2016-08-10 10:37:35 +01:00
Rodrigo Vivi
4194c088df drm/i915: Use drm official vblank_no_hw_counter callback.
No functional change. Instead of defining a new empty function
let's use what is available on drm.

It gets cleaner, and easy to read, and understand.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2016-08-05 09:40:07 -07:00
Chris Wilson
dcff85c844 drm/i915: Enable i915_gem_wait_for_idle() without holding struct_mutex
The principal motivation for this was to try and eliminate the
struct_mutex from i915_gem_suspend - but we still need to hold the mutex
current for the i915_gem_context_lost(). (The issue there is that there
may be an indirect lockdep cycle between cpu_hotplug (i.e. suspend) and
struct_mutex via the stop_machine().) For the moment, enabling last
request tracking for the engine, allows us to do busyness checking and
waiting without requiring the struct_mutex - which is useful in its own
right.

As a side-effect of having a robust means for tracking engine busyness,
we can replace our other busyness heuristic, that of comparing against
the last submitted seqno. For paranoid reasons, we have a semi-ordered
check of that seqno inside the hangchecker, which we can now improve to
an ordered check of the engine's busyness (removing a locked xchg in the
process).

v2: Pass along "bool interruptible" as being unlocked we cannot rely on
i915->mm.interruptible being stable or even under our control.
v3: Replace check Ironlake i915_gpu_busy() with the common precalculated value

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-6-git-send-email-chris@chris-wilson.co.uk
2016-08-05 10:54:37 +01:00
Chris Wilson
7e37f889b5 drm/i915: Rename struct intel_ringbuffer to struct intel_ring
The state stored in this struct is not only the information about the
buffer object, but the ring used to communicate with the hardware. Using
buffer here is overly specific and, for me at least, conflates with the
notion of buffer objects themselves.

s/struct intel_ringbuffer/struct intel_ring/
s/enum intel_ring_hangcheck/enum intel_engine_hangcheck/
s/describe_ctx_ringbuf()/describe_ctx_ring()/
s/intel_ring_get_active_head()/intel_engine_get_active_head()/
s/intel_ring_sync_index()/intel_engine_sync_index()/
s/intel_ring_init_seqno()/intel_engine_init_seqno()/
s/ring_stuck()/engine_stuck()/
s/intel_cleanup_engine()/intel_engine_cleanup()/
s/intel_stop_engine()/intel_engine_stop()/
s/intel_pin_and_map_ringbuffer_obj()/intel_pin_and_map_ring()/
s/intel_unpin_ringbuffer()/intel_unpin_ring()/
s/intel_engine_create_ringbuffer()/intel_engine_create_ring()/
s/intel_ring_flush_all_caches()/intel_engine_flush_all_caches()/
s/intel_ring_invalidate_all_caches()/intel_engine_invalidate_all_caches()/
s/intel_ringbuffer_free()/intel_ring_free()/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-15-git-send-email-chris@chris-wilson.co.uk
Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-4-git-send-email-chris@chris-wilson.co.uk
2016-08-02 22:58:16 +01:00
Chris Wilson
9930ca1ae7 drm/i915: Update a couple of hangcheck comments to talk about engines
We still have lots of comments that refer to the old ring when we mean
struct intel_engine_cs and its hardware correspondence. This patch fixes
an instance inside hangcheck to talk about engines.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-10-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469606850-28659-5-git-send-email-chris@chris-wilson.co.uk
2016-07-27 16:23:05 +01:00
Chris Wilson
f2f0ed718b drm/i915: Rename ring->virtual_start as ring->vaddr
Just a different colour to better match virtual addresses elsewhere.

s/ring->virtual_start/ring->vaddr/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1469005202-9659-9-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469017917-15134-8-git-send-email-chris@chris-wilson.co.uk
2016-07-20 13:40:39 +01:00
Chris Wilson
406ea8d22f drm/i915: Treat ringbuffer writes as write to normal memory
Ringbuffers are now being written to either through LLC or WC paths, so
treating them as simply iomem is no longer adequate. However, for the
older !llc hardware, the hardware is documentated as treating the TAIL
register update as serialising, so we can relax the barriers when filling
the rings (but even if it were not, it is still an uncached register write
and so serialising anyway.).

For simplicity, let's ignore the iomem annotation.

v2: Remove iomem from ringbuffer->virtual_address
v3: And for good measure add iomem elsewhere to keep sparse happy

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> #v2
Link: http://patchwork.freedesktop.org/patch/msgid/1469005202-9659-8-git-send-email-chris@chris-wilson.co.uk
Link: http://patchwork.freedesktop.org/patch/msgid/1469017917-15134-7-git-send-email-chris@chris-wilson.co.uk
2016-07-20 13:40:14 +01:00
Chris Wilson
29ecd78d3b drm/i915: Define a separate variable and control for RPS waitboost frequency
To allow the user finer control over waitboosting, allow them to set the
frequency we request for the boost. This also them allows to effectively
disable the boosting by setting the boost request to a low frequency.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1468397438-21226-5-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2016-07-14 15:22:58 +01:00
Rodrigo Vivi
22dea0be50 drm/i915: Introduce Kabypoint PCH for Kabylake H/DT.
Some Kabylake SKUs are going to use Kabypoint PCH.
It is mainly for Halo and DT ones.

>From our specs it doesn't seem that KBP brings
any change on the display south engine. So let's consider
this as a continuation of SunrisePoint, i.e., SPT+.

Since it is easy to get confused by a letter change:
KBL = Kabylake - CPU/GPU codename.
KBP = Kabypoint - PCH codename.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96826
Link: http://patchwork.freedesktop.org/patch/msgid/1467418032-15167-1-git-send-email-rodrigo.vivi@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2016-07-07 10:01:37 -07:00
Chris Wilson
aca34b6e1c drm/i915: Group the irq breadcrumb variables into the same cacheline
As we inspect both the tasklet (to check for an active bottom-half) and
set the irq-posted flag at the same time (both in the interrupt handler
and then in the bottom-halt), group those two together into the same
cacheline. (Not having total control over placement of the struct means
we can't guarantee the cacheline boundary, we need to align the kmalloc
and then each struct, but the grouping should help.)

v2: Try a couple of different names for the state touched by the user
interrupt handler.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467805142-22219-3-git-send-email-chris@chris-wilson.co.uk
2016-07-06 12:47:39 +01:00
Chris Wilson
91c8a326a1 drm/i915: Convert dev_priv->dev backpointers to dev_priv->drm
Since drm_i915_private is now a subclass of drm_device we do not need to
chase the drm_i915_private->dev backpointer and can instead simply
access drm_i915_private->drm directly.

   text	   data	    bss	    dec	    hex	filename
1068757	   4565	    416	1073738	 10624a	drivers/gpu/drm/i915/i915.ko
1066949	   4565	    416	1071930	 105b3a	drivers/gpu/drm/i915/i915.ko

Created by the coccinelle script:
@@
struct drm_i915_private *d;
identifier i;
@@
(
- d->dev->i
+ d->drm.i
|
- d->dev
+ &d->drm
)

and for good measure the dev_priv->dev backpointer was removed entirely.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467711623-2905-4-git-send-email-chris@chris-wilson.co.uk
2016-07-05 11:58:45 +01:00
Chris Wilson
2b2842887d drm/i915: Clean up GPU hang message
Remove some redundant kernel messages as we deduce a hung GPU and
capture the error state.

v2: Fix "hang" vs "no progress" message whilst I was there
v3: s/snprintf/scnprintf/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467618513-4966-2-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2016-07-05 11:25:11 +01:00
Chris Wilson
b1379d4964 drm/i915: Replace lockless_dereference(bool) with READ_ONCE()
After Joonas complained about using READ_ONCE() on the only use of the
variable in the function, where the intent was to simply document that
the read was intentionally racy and unlocked, I switched the READ_ONCE()
over to lockless_dereference(). However, in linux-next that has a
stronger type-check to only allow pointers and is no longer
interchangeable with READ_ONCE(), see commit 331b6d8c7a
("locking/barriers: Validate lockless_dereference() is used on a pointer
type")

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: 67d97da349 ("drm/i915: Only start retire worker when idle")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1467705276-707-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2016-07-05 10:58:07 +01:00
Chris Wilson
fac5e23e3c drm/i915: Mass convert dev->dev_private to to_i915(dev)
Since we now subclass struct drm_device, we can save pointer dances by
noting the equivalence of struct drm_device and struct drm_i915_private,
i.e. by using to_i915().

   text    data     bss     dec     hex filename
1073824    4562     416 1078802  107612 drivers/gpu/drm/i915/i915.ko
1068976    4562     416 1073954  106322 drivers/gpu/drm/i915/i915.ko

Created by the coccinelle script:

@@
expression E;
identifier p;
@@
- struct drm_i915_private *p = E->dev_private;
+ struct drm_i915_private *p = to_i915(E);

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467628477-25379-1-git-send-email-chris@chris-wilson.co.uk
2016-07-04 12:54:07 +01:00
Chris Wilson
c33d247d0e drm/i915: Flush the RPS bottom-half when the GPU idles
Make sure that the RPS bottom-half is flushed before we set the idle
frequency when we decide the GPU is idle. This should prevent any races
with the bottom-half and setting the idle frequency, and ensures that
the bottom-half is bounded by the GPU's rpm reference taken for when it
is active (i.e. between gen6_rps_busy() and gen6_rps_idle()).

v2: Avoid recursively using the i915->wq - RPS does not touch the
struct_mutex so has no place being on the ordered i915->wq.
v3: Enable/disable interrupts for RPS busy/idle in order to prevent
further HW access from RPS outside of the wakeref.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
References: https://bugs.freedesktop.org/show_bug.cgi?id=89728
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-6-git-send-email-chris@chris-wilson.co.uk
2016-07-04 08:18:23 +01:00
Chris Wilson
67d97da349 drm/i915: Only start retire worker when idle
The retire worker is a low frequency task that makes sure we retire
outstanding requests if userspace is being lax. We only need to start it
once as it remains active until the GPU is idle, so do a cheap test
before the more expensive queue_work(). A consequence of this is that we
need correct locking in the worker to make the hot path of request
submission cheap. To keep the symmetry and keep hangcheck strictly bound
by the GPU's wakelock, we move the cancel_sync(hangcheck) to the idle
worker before dropping the wakelock.

v2: Guard against RCU fouling the breadcrumbs bottom-half whilst we kick
the waiter.
v3: Remove the wakeref assertion squelching (now we hold a wakeref for
the hangcheck, any rpm error there is genuine).
v4: To prevent excess work when retiring requests, we split the busy
flag into two, a boolean to denote whether we hold the wakeref and a
bitmask of active engines.
v5: Reorder cancelling hangcheck upon idling to avoid a race where we
might cancel a hangcheck after being preempted by a new task

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
References: https://bugs.freedesktop.org/show_bug.cgi?id=88437
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-1-git-send-email-chris@chris-wilson.co.uk
2016-07-04 08:18:19 +01:00
Chris Wilson
c5a7b5aace drm/i915: Remove debug noise on detecting fault-injection of missed interrupts
Since the tests can and do explicitly check debugfs/i915_ring_missed_irqs
for the handling of a "missed interrupt", adding it to the dmesg at INFO
is just noise. When it happens for real, we still class it as an ERROR.

Note that I have chose to remove it entirely because when we detect the
"missed interrupt" is irrelevant and the message contains no more
information than we glean from looking in debugfs.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-20-git-send-email-chris@chris-wilson.co.uk
2016-07-01 21:04:17 +01:00
Chris Wilson
31bb59cc01 drm/i915: Move the get/put irq locking into the caller
With only a single callsite for intel_engine_cs->irq_get and ->irq_put,
we can reduce the code size by moving the common preamble into the
caller, and we can also eliminate the reference counting.

For completeness, as we are no longer doing reference counting on irq,
rename the get/put vfunctions to enable/disable respectively and are
able to review the use of posting reads. We only require the
serialisation with hardware when enabling the interrupt (i.e. so we
cannot miss an interrupt by going to sleep before the hardware truly
enables it).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-18-git-send-email-chris@chris-wilson.co.uk
2016-07-01 21:04:16 +01:00
Chris Wilson
3d5564e910 drm/i915: Only apply one barrier after a breadcrumb interrupt is posted
If we flag the seqno as potentially stale upon receiving an interrupt,
we can use that information to reduce the frequency that we apply the
heavyweight coherent seqno read (i.e. if we wake up a chain of waiters).

v2: Use cmpxchg to replace READ_ONCE/WRITE_ONCE for more explicit
control of the ordering wrt to interrupt generation and interrupt
checking in the bottom-half.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-14-git-send-email-chris@chris-wilson.co.uk
2016-07-01 21:00:54 +01:00
Chris Wilson
f8973c217f drm/i915: Add a delay between interrupt and inspecting the final seqno (ilk)
On Ironlake, there is no command nor register to ensure that the write
from a MI_STORE command is completed (and coherent on the CPU) before the
command parser continues. This means that the ordering between the seqno
write and the subsequent user interrupt is undefined (like gen6+). So to
ensure that the seqno write is completed after the final user interrupt
we need to delay the read sufficiently to allow the write to complete.
This delay is undefined by the bspec, and empirically requires 75us even
though a register read combined with a clflush is less than 500ns. Hence,
the delay is due to an on-chip buffer rather than the latency of the write
to memory.

Note that the render ring controls this by filling the PIPE_CONTROL fifo
with stalling commands that force the earliest pipe-control with the
seqno to be completed before the command parser continues. Given that we
need a barrier operation for BSD, we may as well forgo the extra
per-batch latency by using a common per-interrupt barrier.

Studying the impact of adding the usleep shows that in both sequences of
and individual synchronous no-op batches is negligible for the media
engine (where the write now is unordered with the interrupt). Converting
the render engine over from the current glutton of pie-controls over to
the per-interrupt delays speeds up both the sequential and individual
synchronous no-ops by 20% and 60%, respectively. This speed up holds
even when looking at the throughput of small copies (4KiB->4MiB), both
serial and synchronous, by about 20%. This is because despite adding a
significant delay to the interrupt, in all likelihood we will see the
seqno write without having to apply the barrier (only in the rare corner
cases where the write is delayed on the last required is the delay
necessary).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94307
Testcase: igt/gem_sync #ilk
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-12-git-send-email-chris@chris-wilson.co.uk
2016-07-01 21:00:53 +01:00
Chris Wilson
1b7744e7ba drm/i915: Use HWS for seqno tracking everywhere
By using the same address for storing the HWS on every platform, we can
remove the platform specific vfuncs and reduce the get-seqno routine to
a single read of a cached memory location.

v2: Fix semaphore_passed() to look at the signaling engine (not the
waiter's)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-8-git-send-email-chris@chris-wilson.co.uk
2016-07-01 20:58:48 +01:00
Chris Wilson
688e6c7258 drm/i915: Slaughter the thundering i915_wait_request herd
One particularly stressful scenario consists of many independent tasks
all competing for GPU time and waiting upon the results (e.g. realtime
transcoding of many, many streams). One bottleneck in particular is that
each client waits on its own results, but every client is woken up after
every batchbuffer - hence the thunder of hooves as then every client must
do its heavyweight dance to read a coherent seqno to see if it is the
lucky one.

Ideally, we only want one client to wake up after the interrupt and
check its request for completion. Since the requests must retire in
order, we can select the first client on the oldest request to be woken.
Once that client has completed his wait, we can then wake up the
next client and so on. However, all clients then incur latency as every
process in the chain may be delayed for scheduling - this may also then
cause some priority inversion. To reduce the latency, when a client
is added or removed from the list, we scan the tree for completed
seqno and wake up all the completed waiters in parallel.

Using igt/benchmarks/gem_latency, we can demonstrate this effect. The
benchmark measures the number of GPU cycles between completion of a
batch and the client waking up from a call to wait-ioctl. With many
concurrent waiters, with each on a different request, we observe that
the wakeup latency before the patch scales nearly linearly with the
number of waiters (before external factors kick in making the scaling much
worse). After applying the patch, we can see that only the single waiter
for the request is being woken up, providing a constant wakeup latency
for every operation. However, the situation is not quite as rosy for
many waiters on the same request, though to the best of my knowledge this
is much less likely in practice. Here, we can observe that the
concurrent waiters incur extra latency from being woken up by the
solitary bottom-half, rather than directly by the interrupt. This
appears to be scheduler induced (having discounted adverse effects from
having a rbtree walk/erase in the wakeup path), each additional
wake_up_process() costs approximately 1us on big core. Another effect of
performing the secondary wakeups from the first bottom-half is the
incurred delay this imposes on high priority threads - rather than
immediately returning to userspace and leaving the interrupt handler to
wake the others.

To offset the delay incurred with additional waiters on a request, we
could use a hybrid scheme that did a quick read in the interrupt handler
and dequeued all the completed waiters (incurring the overhead in the
interrupt handler, not the best plan either as we then incur GPU
submission latency) but we would still have to wake up the bottom-half
every time to do the heavyweight slow read. Or we could only kick the
waiters on the seqno with the same priority as the current task (i.e. in
the realtime waiter scenario, only it is woken up immediately by the
interrupt and simply queues the next waiter before returning to userspace,
minimising its delay at the expense of the chain, and also reducing
contention on its scheduler runqueue). This is effective at avoid long
pauses in the interrupt handler and at avoiding the extra latency in
realtime/high-priority waiters.

v2: Convert from a kworker per engine into a dedicated kthread for the
bottom-half.
v3: Rename request members and tweak comments.
v4: Use a per-engine spinlock in the breadcrumbs bottom-half.
v5: Fix race in locklessly checking waiter status and kicking the task on
adding a new waiter.
v6: Fix deciding when to force the timer to hide missing interrupts.
v7: Move the bottom-half from the kthread to the first client process.
v8: Reword a few comments
v9: Break the busy loop when the interrupt is unmasked or has fired.
v10: Comments, unnecessary churn, better debugging from Tvrtko
v11: Wake all completed waiters on removing the current bottom-half to
reduce the latency of waking up a herd of clients all waiting on the
same request.
v12: Rearrange missed-interrupt fault injection so that it works with
igt/drv_missed_irq_hang
v13: Rename intel_breadcrumb and friends to intel_wait in preparation
for signal handling.
v14: RCU commentary, assert_spin_locked
v15: Hide BUG_ON behind the compiler; report on gem_latency findings.
v16: Sort seqno-groups by priority so that first-waiter has the highest
task priority (and so avoid priority inversion).
v17: Add waiters to post-mortem GPU hang state.
v18: Return early for a completed wait after acquiring the spinlock.
Avoids adding ourselves to the tree if the is already complete, and
skips the awkward question of why we don't do completion wakeups for
waits earlier than or equal to ourselves.
v19: Prepare for init_breadcrumbs to fail. Later patches may want to
allocate during init, so be prepared to propagate back the error code.

Testcase: igt/gem_concurrent_blit
Testcase: igt/benchmarks/gem_latency
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com>
Cc: "Gong, Zhipeng" <zhipeng.gong@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
Cc: "Goel, Akash" <akash.goel@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> #v18
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-6-git-send-email-chris@chris-wilson.co.uk
2016-07-01 20:58:43 +01:00
Chris Wilson
1f15b76f1e drm/i915: Separate GPU hang waitqueue from advance
Currently __i915_wait_request uses a per-engine wait_queue_t for the dual
purpose of waking after the GPU advances or for waking after an error.
In the future, we may add even more wake sources and require greater
separation, but for now we can conceptually simplify wakeups by separating
the two sources. In particular, this allows us to use different wait-queues
(e.g. one on the engine advancement, a global one for errors and one on
each requests) without any hassle.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-5-git-send-email-chris@chris-wilson.co.uk
2016-07-01 20:42:31 +01:00
Chris Wilson
26a02b8fc3 drm/i915: Make queueing the hangcheck work inline
Since the function is a small wrapper around schedule_delayed_work(),
move it inline to remove the function call overhead for the principle
caller.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-4-git-send-email-chris@chris-wilson.co.uk
2016-07-01 20:42:30 +01:00
Chris Wilson
7774002586 drm/i915: Remove the dedicated hangcheck workqueue
The queue only ever contains at most one item and has no special flags.
It is just a very simple wrapper around the system-wq - a complication
with no benefits.

v2: Use the system_long_wq as we may wish to capture the error state
after detecting the hang - which may take a bit of time.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-3-git-send-email-chris@chris-wilson.co.uk
2016-07-01 20:42:29 +01:00