Commit Graph

29663 Commits

Author SHA1 Message Date
Maarten Lankhorst
b707654636 drm/i915: Cleanup crt disable sequence on hsw+
Instead of iterating overthe connectors manually, run the last part of
DDI disabling inside the crt post disable function.

This was meant to be addressed before submitting the other commit,
but I missed the review comments.

Fixes: fd6bbda9c7 ("drm/i915: Pass crtc_state and connector_state to encoder functions")
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471961888-10771-2-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
[mlankhorst: Fix extra whitespace between functions.]
2016-08-24 09:49:10 +02:00
Maarten Lankhorst
496b0fc370 drm/i915: Create a intel_encoder_find_connector helper function.
This makes the code in intel_sanitize_encoder slightly more readable.
This was meant to be addressed in fd6bbda9c7, but I missed that
review comment.

Fixes: fd6bbda9c7 ("drm/i915: Pass crtc_state and connector_state to encoder functions")
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471961888-10771-1-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
[mlankhorst: Fix unused variable reported by kbuild.]
2016-08-24 09:49:10 +02:00
Lyude
62e0fb8801 drm/i915/skl: Update plane watermarks atomically during plane updates
Thanks to Ville for suggesting this as a potential solution to pipe
underruns on Skylake.

On Skylake all of the registers for configuring planes, including the
registers for configuring their watermarks, are double buffered. New
values written to them won't take effect until said registers are
"armed", which is done by writing to the PLANE_SURF (or in the case of
cursor planes, the CURBASE register) register.

With this in mind, up until now we've been updating watermarks on skl
like this:

  non-modeset {
   - calculate (during atomic check phase)
   - finish_atomic_commit:
     - intel_pre_plane_update:
        - intel_update_watermarks()
     - {vblank happens; new watermarks + old plane values => underrun }
     - drm_atomic_helper_commit_planes_on_crtc:
        - start vblank evasion
        - write new plane registers
        - end vblank evasion
  }

  or

  modeset {
   - calculate (during atomic check phase)
   - finish_atomic_commit:
     - crtc_enable:
        - intel_update_watermarks()
     - {vblank happens; new watermarks + old plane values => underrun }
     - drm_atomic_helper_commit_planes_on_crtc:
        - start vblank evasion
        - write new plane registers
        - end vblank evasion
  }

Now we update watermarks atomically like this:

  non-modeset {
   - calculate (during atomic check phase)
   - finish_atomic_commit:
     - intel_pre_plane_update:
        - intel_update_watermarks() (wm values aren't written yet)
     - drm_atomic_helper_commit_planes_on_crtc:
        - start vblank evasion
        - write new plane registers
        - write new wm values
        - end vblank evasion
  }

  modeset {
   - calculate (during atomic check phase)
   - finish_atomic_commit:
     - crtc_enable:
        - intel_update_watermarks() (actual wm values aren't written
          yet)
     - drm_atomic_helper_commit_planes_on_crtc:
        - start vblank evasion
        - write new plane registers
	- write new wm values
        - end vblank evasion
  }

So this patch moves all of the watermark writes into the right place;
inside of the vblank evasion where we update all of the registers for
each plane. While this patch doesn't fix everything, it does allow us to
update the watermark values in the way the hardware expects us to.

Changes since original patch series:
 - Remove mutex_lock/mutex_unlock since they don't do anything and we're
   not touching global state
 - Move skl_write_cursor_wm/skl_write_plane_wm functions into
   intel_pm.c, make externally visible
 - Add skl_write_plane_wm calls to skl_update_plane
 - Fix conditional for for loop in skl_write_plane_wm (level < max_level
   should be level <= max_level)
 - Make diagram in commit more accurate to what's actually happening
 - Add Fixes:

Changes since v1:
 - Use IS_GEN9() instead of IS_SKYLAKE() since these fixes apply to more
   then just Skylake
 - Update description to make it clear this patch doesn't fix everything
 - Check if pipes were actually changed before writing watermarks

Changes since v2:
 - Write PIPE_WM_LINETIME during vblank evasion

Changes since v3:
 - Rebase against new SAGV patch changes

Changes since v4:
 - Add a parameter to choose what skl_wm_values struct to use when
   writing new plane watermarks

Changes since v5:
 - Remove cursor ddb entry write in skl_write_cursor_wm(), defer until
   patch 6
 - Write WM_LINETIME in intel_begin_crtc_commit()

Changes since v6:
 - Remove redundant dirty_pipes check in skl_write_plane_wm (we check
   this in all places where we call this function, and it was supposed
   to have been removed earlier anyway)
 - In i9xx_update_cursor(), use dev_priv->info.gen >= 9 instead of
   IS_GEN9(dev_priv). We do this everywhere else and I'd imagine this
   needs to be done for gen10 as well

Changes since v7:
 - Fix rebase fail (unused variable obj)
 - Make struct skl_wm_values *wm const
 - Fix indenting
 - Use INTEL_GEN() instead of dev_priv->info.gen

Changes since v8:
 - Don't forget calls to skl_write_plane_wm() when disabling planes
 - Use INTEL_GEN(), not INTEL_INFO()->gen in intel_begin_crtc_commit()

Fixes: 2d41c0b59a ("drm/i915/skl: SKL Watermark Computation")
Signed-off-by: Lyude <cpaul@redhat.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Cc: stable@vger.kernel.org
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471884608-10671-1-git-send-email-cpaul@redhat.com
Link: http://patchwork.freedesktop.org/patch/msgid/1471884608-10671-1-git-send-email-cpaul@redhat.com
2016-08-23 12:04:59 +02:00
Maarten Lankhorst
1f0017f6f3 drm/i915: Use more atomic state in intel_color.c
crtc_state is already passed around, use it instead of crtc->config.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470755054-32699-16-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-08-23 11:58:56 +02:00
Maarten Lankhorst
85cb48a165 drm/i915: Convert intel_dp to use atomic state
Slightly less straightforward. Some of the drrs calls are done from
workers or from intel_ddi.c, pass along crtc_state when we can,
or crtc->config when we can't.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470755054-32699-15-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-08-23 11:57:07 +02:00
Maarten Lankhorst
1e7bfa0b0e drm/i915: Convert intel_dp_mst to use atomic state
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470755054-32699-14-git-send-email-maarten.lankhorst@linux.intel.com
[mlankhorst: Address bikeshed.]
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-08-23 11:56:53 +02:00
Maarten Lankhorst
d468e21e8c drm/i915: Convert intel_lvds to use atomic state
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470755054-32699-13-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
[mlankhorst: Small fixup wrt register renames.]
2016-08-23 11:29:45 +02:00
Maarten Lankhorst
f9fe053064 drm/i915: Convert intel_sdvo to use atomic state
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470755054-32699-12-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-08-23 11:21:51 +02:00
Maarten Lankhorst
5eff0edf32 drm/i915: Convert intel_dsi to use atomic state
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470755054-32699-11-git-send-email-maarten.lankhorst@linux.intel.com
[mlankhorst: Unbreak bxt_dsi_get_pipe_config]
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-08-23 11:21:39 +02:00
Maarten Lankhorst
ae9a911bb0 drm/i915: Convert intel_dvo to use atomic state
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470755054-32699-10-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-08-23 11:08:29 +02:00
Maarten Lankhorst
225cc34840 drm/i915: Convert intel_crt to use atomic state
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470755054-32699-9-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-08-23 11:08:08 +02:00
Maarten Lankhorst
1189e4f4d8 drm/i915: Remove unused loop from intel_dp_mst_compute_config
Now that conn_state is passed in as argument to compute_config, it's
guaranteed that there is a connector for the argument. The code that
looks for the connector is now dead, and completely unused. Delete it.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470755054-32699-8-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-08-23 11:07:33 +02:00
Maarten Lankhorst
0a478c27db drm/i915: Make encoder->compute_config take the connector state
Some places iterate over connector_state to find the right
connector, pass it along as argument.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470755054-32699-7-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-08-23 11:07:23 +02:00
Maarten Lankhorst
fd6bbda9c7 drm/i915: Pass crtc_state and connector_state to encoder functions
This is mostly code churn, with exception of a few places:
- intel_display.c has changes in intel_sanitize_encoder
- intel_ddi.c has intel_ddi_fdi_disable calling intel_ddi_post_disable,
  and required a function change. Also affects intel_display.c
- intel_dp_mst.c passes a NULL crtc_state and conn_state to
  intel_ddi_post_disable for shutting down the real encoder.

  If we would pass conn_state, then conn_state->connector !=
  intel_dig_port->connector and conn_state->best_encoder !=
  to_intel_encoder(intel_dig_port).

  We also shouldn't pass crtc_state, because in that case the
  disabling sequence may potentially be different depending on
  which crtc is disabled last. Nice way to introduce bugs.

No other functional changes are done, diff stat is already huge.
Each encoder type will need to be fixed to use the atomic states
separately.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470755054-32699-6-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-08-23 11:06:50 +02:00
Maarten Lankhorst
fb1c98b181 drm/i915: Walk over encoders in crtc enable/disable using atomic state.
This cleans up another possible use of the connector list,
encoder->crtc is legacy state and should not be used.

With the atomic state as argument it's easy to find the encoder from
the connector it belongs to.

intel_opregion_notify_encoder is a noop for !HAS_DDI, so it's harmless
to unconditionally include it in encoder enable/disable.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470755054-32699-5-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-08-23 11:00:44 +02:00
Maarten Lankhorst
c376399c83 drm/i915: Remove unused mode_set hook from encoder
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470755054-32699-4-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-08-23 10:56:36 +02:00
Maarten Lankhorst
4a80655827 drm/i915: Pass atomic state to crtc enable/disable functions
This is required for supporting nonblocking modesets. Iterating over
the connector list will no longer be allowed when we don't hold
connection_mutex, so we have to use the atomic state.

Fix disable_noatomic by populating the minimal state required to
disable a connector.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470755054-32699-3-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-08-23 10:56:18 +02:00
Maarten Lankhorst
a79e8cc792 drm/i915: handle DP_MST correctly in bxt_get_dpll
No idea if it supports it, but this is the minimum required from get_dpll.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470755054-32699-2-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-08-23 10:55:54 +02:00
Chris Wilson
4e6c2d58ba drm/i915: Take forcewake once for the entire GMBUS transaction
As we do many register reads within a very short period of time, hold
the GMBUS powerwell from start to finish.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: David Weinehall <david.weinehall@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160819164503.17845-1-chris@chris-wilson.co.uk
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
2016-08-22 18:42:44 +01:00
Chris Wilson
637ee29eff drm/i915: Fix nesting of filelist_mutex vs struct_mutex in i915_ppgtt_info
An unlikely ABBA deadlock in debugfs that no one has reported.

[  284.922349] ======================================================
[  284.922355] [ INFO: possible circular locking dependency detected ]
[  284.922361] 4.8.0-rc2+ #430 Tainted: G        W
[  284.922366] -------------------------------------------------------
[  284.922371] cat/1197 is trying to acquire lock:
[  284.922376]  (&dev->filelist_mutex){+.+...}, at: [<ffffffffa0055ba2>] i915_ppgtt_info+0x82/0x390 [i915]
[  284.922423]
[  284.922423] but task is already holding lock:
[  284.922429]  (&dev->struct_mutex){+.+.+.}, at: [<ffffffffa0055b55>] i915_ppgtt_info+0x35/0x390 [i915]
[  284.922465]
[  284.922465] which lock already depends on the new lock.
[  284.922465]
[  284.922471]
[  284.922471] the existing dependency chain (in reverse order) is:
[  284.922477]
-> #1 (&dev->struct_mutex){+.+.+.}:
[  284.922493]        [<ffffffff81087710>] lock_acquire+0x60/0x80
[  284.922505]        [<ffffffff8143e96f>] mutex_lock_nested+0x5f/0x360
[  284.922520]        [<ffffffffa004f877>] print_context_stats+0x37/0xf0 [i915]
[  284.922549]        [<ffffffffa00535f5>] i915_gem_object_info+0x265/0x490 [i915]
[  284.922581]        [<ffffffff81144491>] seq_read+0xe1/0x3b0
[  284.922592]        [<ffffffff811f77b3>] full_proxy_read+0x83/0xb0
[  284.922604]        [<ffffffff8111ba03>] __vfs_read+0x23/0x110
[  284.922616]        [<ffffffff8111c9b9>] vfs_read+0x89/0x110
[  284.922626]        [<ffffffff8111dbf4>] SyS_read+0x44/0xa0
[  284.922636]        [<ffffffff81442be9>] entry_SYSCALL_64_fastpath+0x1c/0xac
[  284.922648]
-> #0 (&dev->filelist_mutex){+.+...}:
[  284.922667]        [<ffffffff810871fc>] __lock_acquire+0x10fc/0x1270
[  284.922678]        [<ffffffff81087710>] lock_acquire+0x60/0x80
[  284.922689]        [<ffffffff8143e96f>] mutex_lock_nested+0x5f/0x360
[  284.922701]        [<ffffffffa0055ba2>] i915_ppgtt_info+0x82/0x390 [i915]
[  284.922729]        [<ffffffff81144491>] seq_read+0xe1/0x3b0
[  284.922739]        [<ffffffff811f77b3>] full_proxy_read+0x83/0xb0
[  284.922750]        [<ffffffff8111ba03>] __vfs_read+0x23/0x110
[  284.922761]        [<ffffffff8111c9b9>] vfs_read+0x89/0x110
[  284.922771]        [<ffffffff8111dbf4>] SyS_read+0x44/0xa0
[  284.922781]        [<ffffffff81442be9>] entry_SYSCALL_64_fastpath+0x1c/0xac
[  284.922793]
[  284.922793] other info that might help us debug this:
[  284.922793]
[  284.922809]  Possible unsafe locking scenario:
[  284.922809]
[  284.922818]        CPU0                    CPU1
[  284.922825]        ----                    ----
[  284.922831]   lock(&dev->struct_mutex);
[  284.922842]                                lock(&dev->filelist_mutex);
[  284.922854]                                lock(&dev->struct_mutex);
[  284.922865]   lock(&dev->filelist_mutex);
[  284.922875]
[  284.922875]  *** DEADLOCK ***
[  284.922875]
[  284.922888] 3 locks held by cat/1197:
[  284.922895]  #0:  (debugfs_srcu){......}, at: [<ffffffff811f7730>] full_proxy_read+0x0/0xb0
[  284.922919]  #1:  (&p->lock){+.+.+.}, at: [<ffffffff811443e8>] seq_read+0x38/0x3b0
[  284.922942]  #2:  (&dev->struct_mutex){+.+.+.}, at: [<ffffffffa0055b55>] i915_ppgtt_info+0x35/0x390 [i915]
[  284.922983]

Fixes: 1d2ac403ae ("drm: Protect dev->filelist with its own mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160822132820.21725-1-chris@chris-wilson.co.uk
2016-08-22 15:08:05 +01:00
Chris Wilson
34730fed9f drm/i915: Ignore stuck requests when considering hangs
If the engine isn't being retired (worker starvation?) then it is
possible for us to repeatedly observe that between consecutive
hangchecks the seqno on the ring to be the same and there remain
unretired requests. Ignore these completely and only regard the engine
as busy for the purpose of hang detection (not stall detection) if there
are outstanding breadcrumbs.

In recent history we have looked at using both the request and seqno as
indication of activity on the engine, but that was reduced to just
inspecting seqno in commit cffa781e59 ("drm/i915: Simplify check for
idleness in hangcheck"). However, in commit dcff85c844 ("drm/i915:
Enable i915_gem_wait_for_idle() without holding struct_mutex"), I made
the decision to use the new common lockless function, under the
assumption that request retirement was more frequent than hangcheck and
so we would not have a stuck busy check. The flaw there was in
forgetting that we accumulate the hang score, and so successive checks
seeing a stuck request, albeit with the GPU advancing elsewhere and so
not necessary the same stuck request, would eventually trigger the hang.

Fixes: dcff85c844 ("drm/i915: Enable i915_gem_wait_for_idle()...")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160820145408.32180-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2016-08-22 13:09:11 +01:00
Chris Wilson
bb8f9cffad drm/i915: Allow DMA pagetables to use highmem
As we never need to directly access the pages we allocate for scratch and
the pagetables, and always remap them into the GTT through the dma
remapper, we do not need to limit the allocations to lowmem i.e. we can
pass in the __GFP_HIGHMEM flag to the page allocation.

For backwards compatibility, e.g. certain old GPUs not liking highmem
for certain functions that may be accidentally mapped to the scratch
page by userspace, keep the GMCH probe as only allocating from DMA32.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20160822074431.26872-3-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-08-22 12:19:52 +01:00
Chris Wilson
8bcdd0f756 drm/i915: Embed the scratch page struct into each VM
As the scratch page is no longer shared between all VM, and each has
their own, forgo the small allocation and simply embed the scratch page
struct into the i915_address_space.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20160822074431.26872-2-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Mika Kuoppala <mika.kuoppala@intel.com>
2016-08-22 12:19:52 +01:00
David Weinehall
36cdd0138b drm/i915: debugfs spring cleaning
Just like with sysfs, we do some major overhaul.

Pass dev_priv instead of dev to all feature macros (IS_, HAS_,
INTEL_, etc.). This has the side effect that a bunch of functions
now get dev_priv passed instead of dev.

All calls to INTEL_INFO()->gen have been replaced with
INTEL_GEN().

We want access to to_i915(node->minor->dev) in a lot of places,
so add the node_to_i915() helper to accommodate for this.

Finally, we have quite a few cases where we get a void * pointer,
and need to cast it to drm_device *, only to run to_i915() on it.
Add cast_to_i915() to do this.

v2: Don't introduce extra dev (Chris)

v3: Make pipe_crc_info have a pointer to drm_i915_private instead of
    drm_device. This saves a bit of space, since we never use
    drm_device anywhere in these functions.

    Also some minor fixup that I missed in the previous version.

v4: Changed the code a bit so that dev_priv is passed directly
    to various functions, thus removing the need for the
    cast_to_i915() helper. Also did some additional cleanup.

v5: Additional cleanup of newly introduced changes.

v6: Rebase again because of conflict.

Signed-off-by: David Weinehall <david.weinehall@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160822105931.pcbe2lpsgzckzboa@boom
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-08-22 12:19:52 +01:00
David Weinehall
52a05c302b drm/i915: pdev cleanup
In an effort to simplify things for a future push of dev_priv instead
of dev wherever possible, always take pdev via dev_priv where
feasible, eliminating the direct access from dev. Right now this
only eliminates a few cases of dev, but it also obviates that we pass
dev into a lot of functions where dev_priv would be the more obvious
choice.

v2: Fixed one more place missing in the previous patch set

Signed-off-by: David Weinehall <david.weinehall@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160822103245.24069-5-david.weinehall@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-08-22 12:19:52 +01:00
David Weinehall
694c282845 drm/i915: i915_sysfs.c cleanup
Various cleanup for i915_sysfs.c; we now use dev_priv whenever
possible. The kdev_to_drm_minor() helper function has been
replaced by one that converts from struct device *
to struct drm_i915_private *.

We already have a seemingly identical helper (kdev_to_i915())
in i915_drv.h. But that one cannot be used here.
Unlike the version in i915_drv.h, this helper
reaches i915 through drm_minor.

v2: Rename kdev_to_i915_dm() to kdev_minor_to_i915() (Chris)

Signed-off-by: David Weinehall <david.weinehall@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160822103245.24069-4-david.weinehall@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-08-22 12:19:52 +01:00
David Weinehall
c49d13ee13 drm/i915: consistent struct device naming
We currently have a mix of struct device *device, struct device *kdev,
and struct device *dev (the latter forcing us to refer to
struct drm_device as something else than the normal dev).

To simplify things, always use kdev when referring to struct device.

v2: Replace the dev_to_drm_minor() macro with the inline function
    kdev_to_drm_minor().

Signed-off-by: David Weinehall <david.weinehall@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160822103245.24069-3-david.weinehall@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-08-22 12:19:52 +01:00
David Weinehall
351c3b53e7 drm/i915: cosmetic fixes to i915_drv.h
Fix minor whitespace issues plus a typo.

Signed-off-by: David Weinehall <david.weinehall@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160822103245.24069-2-david.weinehall@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-08-22 12:19:52 +01:00
Maarten Lankhorst
536ab3ca19 drm/i915: Fix botched merge that downgrades CSR versions.
Merge commit 5e580523d9 reverts the version bumping parts of
commit 4aa7fb9c3c. Bump the versions again and request the specific
firmware version.

The currently recommended versions are: SKL 1.26, KBL 1.01 and BXT 1.07.

Cc: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97242
Cc: drm-intel-fixes@lists.freedesktop.org
Fixes: 5e580523d9 ("Backmerge tag 'v4.7' into drm-next")
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471266567-22443-1-git-send-email-maarten.lankhorst@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
2016-08-22 12:54:50 +02:00
Lyude
05a76d3d6a drm/i915/skl: Ensure pipes with changed wms get added to the state
If we're enabling a pipe, we'll need to modify the watermarks on all
active planes. Since those planes won't be added to the state on
their own, we need to add them ourselves.

Signed-off-by: Lyude <cpaul@redhat.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Cc: stable@vger.kernel.org
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471463761-26796-6-git-send-email-cpaul@redhat.com
2016-08-22 12:54:41 +02:00
Matt Roper
2722efb90b drm/i915/gen9: Only copy WM results for changed pipes to skl_hw
When we write watermark values to the hardware, those values are stored
in dev_priv->wm.skl_hw.  However with recent watermark changes, the
results structure we're copying from only contains valid watermark and
DDB values for the pipes that are actually changing; the values for
other pipes remain 0.  Thus a blind copy of the entire skl_wm_values
structure will clobber the values for unchanged pipes...we need to be
more selective and only copy over the values for the changing pipes.

This mistake was hidden until recently due to another bug that caused us
to erroneously re-calculate watermarks for all active pipes rather than
changing pipes.  Only when that bug was fixed was the impact of this bug
discovered (e.g., modesets failing with "Requested display configuration
exceeds system watermark limitations" messages and leaving watermarks
non-functional, even ones initiated by intel_fbdev_restore_mode).

Changes since v1:
 - Add a function for copying a pipe's wm values
   (skl_copy_wm_for_pipe()) so we can reuse this later

Fixes: 734fa01f3a ("drm/i915/gen9: Calculate watermarks during atomic 'check' (v2)")
Fixes: 9b61302274 ("drm/i915/gen9: Re-allocate DDB only for changed pipes")
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Lyude <cpaul@redhat.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Cc: stable@vger.kernel.org
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471463761-26796-4-git-send-email-cpaul@redhat.com
2016-08-22 12:54:41 +02:00
Lyude
656d1b89e5 drm/i915/skl: Add support for the SAGV, fix underrun hangs
Since the watermark calculations for Skylake are still broken, we're apt
to hitting underruns very easily under multi-monitor configurations.
While it would be lovely if this was fixed, it's not. Another problem
that's been coming from this however, is the mysterious issue of
underruns causing full system hangs. An easy way to reproduce this with
a skylake system:

- Get a laptop with a skylake GPU, and hook up two external monitors to
  it
- Move the cursor from the built-in LCD to one of the external displays
  as quickly as you can
- You'll get a few pipe underruns, and eventually the entire system will
  just freeze.

After doing a lot of investigation and reading through the bspec, I
found the existence of the SAGV, which is responsible for adjusting the
system agent voltage and clock frequencies depending on how much power
we need. According to the bspec:

"The display engine access to system memory is blocked during the
 adjustment time. SAGV defaults to enabled. Software must use the
 GT-driver pcode mailbox to disable SAGV when the display engine is not
 able to tolerate the blocking time."

The rest of the bspec goes on to explain that software can simply leave
the SAGV enabled, and disable it when we use interlaced pipes/have more
then one pipe active.

Sure enough, with this patchset the system hangs resulting from pipe
underruns on Skylake have completely vanished on my T460s. Additionally,
the bspec mentions turning off the SAGV	with more then one pipe enabled
as a workaround for display underruns. While this patch doesn't entirely
fix that, it looks like it does improve the situation a little bit so
it's likely this is going to be required to make watermarks on Skylake
fully functional.

This will still need additional work in the future: we shouldn't be
enabling the SAGV if any of the currently enabled planes can't enable WM
levels that introduce latencies >= 30 µs.

Changes since v11:
 - Add skl_can_enable_sagv()
 - Make sure we don't enable SAGV when not all planes can enable
   watermarks >= the SAGV engine block time. I was originally going to
   save this for later, but I recently managed to run into a machine
   that was having problems with a single pipe configuration + SAGV.
 - Make comparisons to I915_SKL_SAGV_NOT_CONTROLLED explicit
 - Change I915_SAGV_DYNAMIC_FREQ to I915_SAGV_ENABLE
 - Move printks outside of mutexes
 - Don't print error messages twice
Changes since v10:
 - Apparently sandybridge_pcode_read actually writes values and reads
   them back, despite it's misleading function name. This means we've
   been doing this mostly wrong and have been writing garbage to the
   SAGV control. Because of this, we no longer attempt to read the SAGV
   status during initialization (since there are no helpers for this).
 - mlankhorst noticed that this patch was breaking on some very early
   pre-release Skylake machines, which apparently don't allow you to
   disable the SAGV. To prevent machines from failing tests due to SAGV
   errors, if the first time we try to control the SAGV results in the
   mailbox indicating an invalid command, we just disable future attempts
   to control the SAGV state by setting dev_priv->skl_sagv_status to
   I915_SKL_SAGV_NOT_CONTROLLED and make a note of it in dmesg.
 - Move mutex_unlock() a little higher in skl_enable_sagv(). This
   doesn't actually fix anything, but lets us release the lock a little
   sooner since we're finished with it.
Changes since v9:
 - Only enable/disable sagv on Skylake
Changes since v8:
 - Add intel_state->modeset guard to the conditional for
   skl_enable_sagv()
Changes since v7:
 - Remove GEN9_SAGV_LOW_FREQ, replace with GEN9_SAGV_IS_ENABLED (that's
   all we use it for anyway)
 - Use GEN9_SAGV_IS_ENABLED instead of 0x1 for clarification
 - Fix a styling error that snuck past me
Changes since v6:
 - Protect skl_enable_sagv() with intel_state->modeset conditional in
   intel_atomic_commit_tail()
Changes since v5:
 - Don't use is_power_of_2. Makes things confusing
 - Don't use the old state to figure out whether or not to
   enable/disable the sagv, use the new one
 - Split the loop in skl_disable_sagv into it's own function
 - Move skl_sagv_enable/disable() calls into intel_atomic_commit_tail()
Changes since v4:
 - Use is_power_of_2 against active_crtcs to check whether we have > 1
   pipe enabled
 - Fix skl_sagv_get_hw_state(): (temp & 0x1) indicates disabled, 0x0
   enabled
 - Call skl_sagv_enable/disable() from pre/post-plane updates
Changes since v3:
 - Use time_before() to compare timeout to jiffies
Changes since v2:
 - Really apply minor style nitpicks to patch this time
Changes since v1:
 - Added comments about this probably being one of the requirements to
   fixing Skylake's watermark issues
 - Minor style nitpicks from Matt Roper
 - Disable these functions on Broxton, since it doesn't have an SAGV

Signed-off-by: Lyude <cpaul@redhat.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471463761-26796-3-git-send-email-cpaul@redhat.com
[mlankhorst: ENOSYS -> ENXIO, whitespace fixes]
2016-08-22 12:54:41 +02:00
Lyude
87660502f1 drm/i915/gen6+: Interpret mailbox error flags
In order to add proper support for the SAGV, we need to be able to know
what the cause of a failure to change the SAGV through the pcode mailbox
was. The reasoning for this is that some very early pre-release Skylake
machines don't actually allow you to control the SAGV on them, and
indicate an invalid mailbox command was sent.

This also might come in handy in the future for debugging.

Changes since v1:
 - Add functions for interpreting gen6 mailbox error codes along with
   gen7+ error codes, and actually interpret those codes properly
 - Renamed patch to reflect new behavior

Signed-off-by: Lyude <cpaul@redhat.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471463761-26796-2-git-send-email-cpaul@redhat.com
[mlankhorst: -ENOSYS -> -ENXIO for checkpatch]
2016-08-22 12:54:41 +02:00
Paulo Zanoni
1f061316cf drm/i915: Call intel_fbc_pre_update() after pinning the new pageflip
intel_fbc_pre_update() depends upon the new state being already pinned
in place in the Global GTT (primarily for both fencing which wants both
an offset and a fence register, if assigned). This requires the call to
intel_fbc_pre_update() be after intel_pin_and_fence_fb() - but commit
e8216e502a ("drm/i915/fbc: call intel_fbc_pre_update earlier during
page flips") moved the code way too much up in its attempt to call it
before the page flip.

v2 (from Paulo):
 - Point the original bad commit.
 - Add a comment to maybe prevent further regressions.

Fixes: e8216e502a ("drm/i915/fbc: call intel_fbc_pre_update earlier...")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Patrik Jakobsson <patrik.jakobsson@linux.intel.com>
Cc: drm-intel-fixes@lists.freedesktop.org
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1471462904-842-1-git-send-email-paulo.r.zanoni@intel.com
Cc: stable@vger.kernel.org
2016-08-22 11:28:58 +01:00
Daniel Vetter
c75870d86f drm/i915: Ensure consistent control flow __i915_gem_active_get_rcu
This issue here is (I think) purely theoretical, since a compiler
would need to be especially foolish to recompute the value of
i915_gem_request_completed right after it was already used. Hence the
additional barrier() is also not really a restriction.

But I believe this to be at least permissible, and since our rcu
trickery is a beast it's worth to annotate all the corner cases.
Chris proposed to instead just wrap a READ_ONCE around
request->fence.seqno in i915_gem_request_completed. But that has a
measurable impact on code size, and everywhere we hold a full
reference to the underlying request it's also not needed. And
personally I'd like to have just enough barriers and locking needed
for correctness, but not more - it makes it much easier in the future
to understand what's going on.

Since the busy ioctl has now fully embraced it's races there's no
point annotating it there too. We really only need it in
active_get_rcu, since that function _must_ deliver a correct snapshot
of the active fences (and not chase something else).

v2: Polish the comment a bit more (Chris).

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1471856122-466-1-git-send-email-daniel.vetter@ffwll.ch
2016-08-22 11:15:09 +01:00
Chris Wilson
14daa63b27 drm/i915: Stop marking the unaccessible scratch page as UC
Since by design, if not entirely by practice, nothing is allowed to
access the scratch page we use to background fill the VM, then we do not
need to ensure that it is coherent between the CPU and GPU.
set_pages_uc() does a stop_machine() after changing the PAT, and that
significantly impacts upon context creation throughput.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20160822074431.26872-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-08-22 11:01:47 +01:00
Chris Wilson
5f4b091aa2 drm/i915: Restore debugfs/i915_gem_gtt back to its former glory
The passed in flag that distinguishes i915_gem_pin_display from
i915_gem_gtt is from node->info_ent->data not the data function
parameter.

Fixes: 6da8482936 ("drm/i915: Focus debugfs/i915_gem_pinned to show...")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160819115625.17688-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-08-22 10:32:38 +01:00
Daniel Vetter
d5d0804f8f drm/i915: Update DRIVER_DATE to 20160822
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-08-22 08:35:48 +02:00
Chris Wilson
c58305af18 drm/i915: Use remap_io_mapping() to prefault all PTE in a single pass
Very old numbers indicate this is a 66% improvement when remapping the
entire object for fence contention - due to the elimination of
track_pfn_insert and its strcmp.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Testcase: igt/gem_fence_upload/performance
Testcase: igt/gem_mmap_gtt
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160819155428.1670-6-chris@chris-wilson.co.uk
2016-08-19 17:13:36 +01:00
Chris Wilson
f7bbe7883c drm/i915: Embed the io-mapping struct inside drm_i915_private
As io_mapping.h now always allocates the struct, we can avoid that
allocation and extra pointer dance by embedding the struct inside
drm_i915_private

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160819155428.1670-5-chris@chris-wilson.co.uk
2016-08-19 17:13:35 +01:00
Chris Wilson
8678fdaf39 drm/i915/fbc: Allow on unfenced surfaces, for recent gen
Only fbc1 is tied to using a fence. Later iterations of fbc are more
flexible and allow operation on unfenced frontbuffers.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: "Zanoni, Paulo R" <paulo.r.zanoni@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160819155428.1670-3-chris@chris-wilson.co.uk
2016-08-19 17:13:29 +01:00
Chris Wilson
12ecf4b979 drm/i915/fbc: Don't set an illegal fence if unfenced
If the frontbuffer doesn't have an associated fence, it will have a
fence reg of -1. If we attempt to OR in this register into the FBC
control register we end up setting all control bits, oops!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: "Zanoni, Paulo R" <paulo.r.zanoni@intel.com>
Reviwed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160819155428.1670-2-chris@chris-wilson.co.uk
2016-08-19 16:59:36 +01:00
Chris Wilson
4fc788f5ee drm/i915: Flush delayed fence releases after reset
What I never hit in testing, but Mika immediately did, was a GPU hang
with a pending fence release (where a tiled object has been changed by
the user to be untiled, and the update has not yet been committed to the
fence register). As the stride/tiling is 0, this causes a divide-by-zero
error when trying to write the new fence parameters:

[   28.784518] drm/i915: Resetting chip after gpu hang
[   28.784551] divide error: 0000 [#1] PREEMPT SMP
[   28.784565] Modules linked in: nls_iso8859_1 nls_cp437 vfat fat mxm_wmi x86_pkg_temp_thermal snd_hda_codec_hdmi kvm irqbypass snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec serio_raw snd_hwdep snd_hda_core snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq snd_timer snd_seq_device snd soundcore mac_hid wmi efivarfs autofs4 raid10 raid456 libcrc32c async_raid6_recov async_memcpy async_pq raid6_pq async_xor xor async_tx raid0 multipath linear psmouse e1000e ptp pps_core nvme nvme_core i915 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm video
[   28.784738] CPU: 0 PID: 1692 Comm: kworker/0:2 Not tainted 4.8.0-rc2+ #895
[   28.784752] Hardware name: System manufacturer System Product Name/Z170M-PLUS, BIOS 1803 05/09/2016
[   28.784786] Workqueue: events_long i915_hangcheck_elapsed [i915]
[   28.784814] task: ffff923c18f59d40 task.stack: ffff923c1b7e4000
[   28.784827] RIP: 0010:[<ffffffffc0475b5f>]  [<ffffffffc0475b5f>] fence_write+0x9f/0x3b0 [i915]
[   28.784854] RSP: 0018:ffff923c1b7e7b30  EFLAGS: 00010246
[   28.784866] RAX: 00000000008ca000 RBX: ffff923c18540000 RCX: 0000000000000020
[   28.784880] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000596d000
[   28.784894] RBP: ffff923c1b7e7b68 R08: 0000000000000000 R09: 0000000000000000
[   28.784908] R10: 0000000000000000 R11: 00000000008ca000 R12: ffff923c1ef9d600
[   28.784921] R13: 0000000000100040 R14: 0000000000100044 R15: ffff923c18549908
[   28.784935] FS:  0000000000000000(0000) GS:ffff923c36c00000(0000) knlGS:0000000000000000
[   28.784951] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   28.784962] CR2: 00007f193373c893 CR3: 0000000419c78000 CR4: 00000000003406f0
[   28.784976] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   28.784990] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   28.785004] Stack:
[   28.785009]  000000000596c03b ffff923c1b7e7b68 ffff923c18549938 0000000000000009
[   28.785026]  ffff923c18540000 ffff923c18549280 ffff923c18547ce8 ffff923c1b7e7b90
[   28.785044]  ffffffffc04761f9 ffff923c18540000 ffff923c18547d00 ffff923c18548ff8
[   28.785062] Call Trace:
[   28.785078]  [<ffffffffc04761f9>] i915_gem_restore_fences+0x39/0x50 [i915]
[   28.785102]  [<ffffffffc047fe89>] i915_gem_reset+0x179/0x300 [i915]

Reported-by: Mika Kuoppala <mika.kuoppala@intel.com>
Fixes: 49ef5294cd ("drm/i915: Move fence tracking from object to vma")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160819155428.1670-1-chris@chris-wilson.co.uk
2016-08-19 16:59:22 +01:00
Dave Gordon
0be81156b3 drm/i915: Reattach comment, complete type specification
In the recent patch
bc3d674 drm/i915: Allow userspace to request no-error-capture upon ...
the final version moved the flags and the associated #defines around
so they were adjacent; unfortunately, they ended up between a comment
and the thing (hw_id) to which the comment applies :(

So this patch reshuffles the comment and subject back together.

Also, as we're touching 'hw_id', let's change it from just 'unsigned'
to a fully-specified 'unsigned int', because some code checking tools
(including checkpatch) object to plain 'unsigned'.

Fixes: bc3d674462 ("drm/i915: Allow userspace to request no-error-capture...")
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1471616622-6919-1-git-send-email-david.s.gordon@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-08-19 15:31:17 +01:00
Chris Wilson
52a42cec4b drm/i915/cmdparser: Accelerate copies from WC memory
If we need to use clflush to prepare our batch for reads from memory, we
can bypass the cache instead by using non-temporal copies.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-39-chris@chris-wilson.co.uk
2016-08-18 22:37:01 +01:00
Chris Wilson
76ff480ec9 drm/i915/cmdparser: Use binary search for faster register lookup
A significant proportion of the cmdparsing time for some batches is the
cost to find the register in the mmiotable. We ensure that those tables
are in ascending order such that we could do a binary search if it was
ever merited. It is.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-38-chris@chris-wilson.co.uk
2016-08-18 22:37:01 +01:00
Chris Wilson
ea884f09e5 drm/i915/cmdparser: Check for SKIP descriptors first
If the command descriptor says to skip it, ignore checking for anyother
other conflict.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-37-chris@chris-wilson.co.uk
2016-08-18 22:37:00 +01:00
Chris Wilson
efdfd91f9b drm/i915/cmdparser: Compare against the previous command descriptor
On the blitter (and in test code), we see long sequences of repeated
commands, e.g. XY_PIXEL_BLT, XY_SCANLINE_BLT, or XY_SRC_COPY. For these,
we can skip the hashtable lookup by remembering the previous command
descriptor and doing a straightforward compare of the command header.
The corollary is that we need to do one extra comparison before lookup
up new commands.

v2: Less magic mask (ok, it is still magic, but now you cannot see!)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-36-chris@chris-wilson.co.uk
2016-08-18 22:37:00 +01:00
Chris Wilson
d6a4ead7a3 drm/i915/cmdparser: Improve hash function
The existing code's hashfunction is very suboptimal (most 3D commands
use the same bucket degrading the hash to a long list). The code even
acknowledge that the issue was known and the fix simple:

/*
 * If we attempt to generate a perfect hash, we should be able to look at bits
 * 31:29 of a command from a batch buffer and use the full mask for that
 * client. The existing INSTR_CLIENT_MASK/SHIFT defines can be used for this.
 */

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-35-chris@chris-wilson.co.uk
2016-08-18 22:36:59 +01:00
Chris Wilson
ed13033f02 drm/i915/cmdparser: Only cache the dst vmap
For simplicity, we want to continue using a contiguous mapping of the
command buffer, but we can reduce the number of vmappings we hold by
switching over to a page-by-page copy from the user batch buffer to the
shadow. The cost for saving one linear mapping is about 5% in trivial
workloads - which is more or less the overhead in calling kmap_atomic().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-34-chris@chris-wilson.co.uk
2016-08-18 22:36:59 +01:00
Chris Wilson
0b5372727b drm/i915/cmdparser: Use cached vmappings
The single largest factor in the overhead of parsing the commands is the
setup of the virtual mapping to provide a continuous block for the batch
buffer. If we keep those vmappings around (against the better judgement
of mm/vmalloc.c, which we offset by handwaving and looking suggestively
at the shrinker) we can dramatically improve the performance of the
parser for small batches (such as media workloads). Furthermore, we can
use the prepare shmem read/write functions to determine  how best we
need to clflush the range (rather than every page of the object).

The impact of caching both src/dst vmaps is +80% on ivb and +140% on byt
for the throughput on small batches. (Caching just the dst vmap and
iterating over the src, doing a page by page copy is roughly 5% slower
on both platforms. That may be an acceptable trade-off to eliminate one
cached vmapping, and we may be able to reduce the per-page copying overhead
further.) For *this* simple test case, the cmdparser is now within a
factor of 2 of ideal performance.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-33-chris@chris-wilson.co.uk
2016-08-18 22:36:59 +01:00
Chris Wilson
068715b922 drm/i915/cmdparser: Add the TIMESTAMP register for the other engines
Since I have been using the BCS_TIMESTAMP to measure latency of
execution upon the blitter ring, allow regular userspace to also read
from that register. They are already allowed RCS_TIMESTAMP!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-32-chris@chris-wilson.co.uk
2016-08-18 22:36:58 +01:00
Chris Wilson
7756e45407 drm/i915/cmdparser: Make initialisation failure non-fatal
If the developer adds a register in the wrong order, we BUG during boot.
That makes development and testing very difficult. Let's be a bit more
friendly and disable the command parser with a big warning if the tables
are invalid.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-31-chris@chris-wilson.co.uk
2016-08-18 22:36:58 +01:00
Chris Wilson
cd3127d684 drm/i915: Stop discarding GTT cache-domain on unbind vma
Since commit 43566dedde ("drm/i915: Broaden application of
set-domain(GTT)") we allowed objects to be in the GTT domain, but unbound.
Therefore removing the GTT cache domain when removing the GGTT vma is no
longer semantically correct.

An unfortunate side-effect is we lose the wondrously named
i915_gem_object_finish_gtt(), not to be confused with
i915_gem_gtt_finish_object()!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Akash Goel <akash.goel@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-30-chris@chris-wilson.co.uk
2016-08-18 22:36:57 +01:00
Chris Wilson
383d5823e9 drm/i915: Bump the inactive tracking for all VMA accessed
We track the LRU access for eviction and bump the last access for the
user GGTT on set-to-gtt. When we do so we need to not only bump the
primary GGTT VMA but all partials as well. Similarly we want to
bump the last access tracking for when unpinning an object from the
scanout so that they do not get promptly evicted and hopefully remain
available for reuse on the next frame.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-29-chris@chris-wilson.co.uk
2016-08-18 22:36:57 +01:00
Chris Wilson
d8923dcfa5 drm/i915: Track display alignment on VMA
When using the aliasing ppgtt and pageflipping with the shrinker/eviction
active, we note that we often have to rebind the backbuffer before
flipping onto the scanout because it has an invalid alignment. If we
store the worst-case alignment required for a VMA, we can avoid having
to rebind at critical junctures.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-28-chris@chris-wilson.co.uk
2016-08-18 22:36:56 +01:00
Chris Wilson
2efb813d53 drm/i915: Fallback to using unmappable memory for scanout
The existing ABI says that scanouts are pinned into the mappable region
so that legacy clients (e.g. old Xorg or plymouthd) can write directly
into the scanout through a GTT mapping. However if the surface does not
fit into the mappable region, we are better off just trying to fit it
anywhere and hoping for the best. (Any userspace that is capable of
using ginormous scanouts is also likely not to rely on pure GTT
updates.) With the partial vma fault support, we are no longer
restricted to only using scanouts that we can pin (though it is still
preferred for performance reasons and for powersaving features like
FBC).

v2: Skip fence pinning when not mappable.
v3: Add a comment to explain the possible ramifications of not being
    able to use fences for unmappable scanouts.
v4: Rebase to skip over some local patches
v5: Rebase to defer until after we have unmappable GTT fault support

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Deepak S <deepak.s@linux.intel.com>
Cc: Damien Lespiau <damien.lespiau@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-27-chris@chris-wilson.co.uk
2016-08-18 22:36:56 +01:00
Chris Wilson
821188778b drm/i915: Choose not to evict faultable objects from the GGTT
Often times we do not want to evict mapped objects from the GGTT as
these are quite expensive to teardown and frequently reused (causing an
equally, if not more so, expensive setup). In particular, when faulting
in a new object we want to avoid evicting an active object, or else we
may trigger a page-fault-of-doom as we ping-pong between evicting two
objects.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-26-chris@chris-wilson.co.uk
2016-08-18 22:36:55 +01:00
Chris Wilson
50349247ea drm/i915: Drop ORIGIN_GTT for untracked GTT writes
If FBC is set on a framebuffer that is unmapped, all GTT faults will be
from a partial mapping. Writes by the user through the partial VMA are
then untracked by the FBC and so we must use the ORIGIN_CPU when flushing
the I915_GEM_DOMAIN_GTT.

v2: Keep ORIGIN_CPU for set-to-domain(.write=CPU)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: "Zanoni, Paulo R" <paulo.r.zanoni@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-25-chris@chris-wilson.co.uk
2016-08-18 22:36:55 +01:00
Chris Wilson
aa136d9d72 drm/i915: Convert partial ggtt vma to full ggtt if it spans the entire object
If we want to create a partial vma from a chunk that is the same size as
the object, create a normal ggtt vma instead. The benefit is that it
will match future requests for the normal ggtt.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-24-chris@chris-wilson.co.uk
2016-08-18 22:36:51 +01:00
Chris Wilson
a61007a83a drm/i915: Fix partial GGTT faulting
We want to always use the partial VMA as a fallback for a failure to
bind the object into the GGTT. This extends the support partial objects
in the GGTT to cover everything, not just objects too large.

v2: Call the partial view, view not partial.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-23-chris@chris-wilson.co.uk
2016-08-18 22:36:51 +01:00
Chris Wilson
03af84fe7f drm/i915: Choose partial chunksize based on tile row size
In order to support setting up fences for partial mappings of an object,
we have to align those mappings with the fence. The minimum chunksize we
choose is at least the size of a single tile row.

v2: Make minimum chunk size a define for later use

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-22-chris@chris-wilson.co.uk
2016-08-18 22:36:50 +01:00
Chris Wilson
49ef5294cd drm/i915: Move fence tracking from object to vma
In order to handle tiled partial GTT mmappings, we need to associate the
fence with an individual vma.

v2: A couple of silly drops replaced spotted by Joonas

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-21-chris@chris-wilson.co.uk
2016-08-18 22:36:50 +01:00
Chris Wilson
a1e5afbe4d drm/i915: Rename fence.lru_list to link
Our current practice is to only name the actual list (here
dev_priv->fence_list) using "list", and elements upon that list are
referred to as "link". Further, the lru nature is of the list and not of
the node and including in the name does not disambiguate the link from
anything else.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-20-chris@chris-wilson.co.uk
2016-08-18 22:36:49 +01:00
Chris Wilson
364c8172ed drm/i915/userptr: Make gup errors stickier
Keep any error reported by the gup_worker until we are notified that the
arena has changed (via the mmu-notifier). This has the importance of
making two consecutive calls to i915_gem_object_get_pages() reporting
the same error, and curtailing a loop of detecting a fault and requeueing
a gup_worker.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-19-chris@chris-wilson.co.uk
2016-08-18 22:36:49 +01:00
Chris Wilson
c58b735fc7 drm/i915: Allocate rings from stolen
If we have stolen available, make use of it for ringbuffer allocation.
Previously this was restricted to !llc platforms, as writing to stolen
requires a GGTT mapping - but now that we have partial mappable support,
the mappable aperture isn't quite so precious so we can use it more
freely and ringbuffers are a good user for the otherwise wasted stolen.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-18-chris@chris-wilson.co.uk
2016-08-18 22:36:49 +01:00
Chris Wilson
9d80841ea4 drm/i915: Allow ringbuffers to be bound anywhere
Now that we have WC vmapping available, we can bind our rings anywhere
in the GGTT and do not need to restrict them to the mappable region.
Except for stolen objects, for which direct access is verbatim and we
must use the mappable aperture.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-17-chris@chris-wilson.co.uk
2016-08-18 22:36:48 +01:00
Chris Wilson
05a20d098d drm/i915: Move map-and-fenceable tracking to the VMA
By moving map-and-fenceable tracking from the object to the VMA, we gain
fine-grained tracking and the ability to track individual fences on the VMA
(subsequent patch).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-16-chris@chris-wilson.co.uk
2016-08-18 22:36:48 +01:00
Chris Wilson
9e53d9be0d drm/i915: Disallow direct CPU access to stolen pages for relocations
As we cannot access the backing pages behind stolen objects, we should
not attempt to do so for relocations.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-15-chris@chris-wilson.co.uk
2016-08-18 22:36:47 +01:00
Chris Wilson
e8cb909ac3 drm/i915: Fallback to single page GTT mmappings for relocations
If we cannot pin the entire object into the mappable region of the GTT,
try to pin a single page instead. This is much more likely to succeed,
and prevents us falling back to the clflush slow path.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-14-chris@chris-wilson.co.uk
2016-08-18 22:36:47 +01:00
Chris Wilson
d50415cc6c drm/i915: Refactor execbuffer relocation writing
With the introduction of the reloc page cache, we are just one step away
from refactoring the relocation write functions into one. Not only does
it tidy the code (slightly), but it greatly simplifies the control logic
much to gcc's satisfaction.

v2: Add selftests to document the relationship between the clflush
flags, the KMAP bit and packing into the page-aligned pointer.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-13-chris@chris-wilson.co.uk
2016-08-18 22:36:46 +01:00
Chris Wilson
b0dc465f95 drm/i915: Tidy up flush cpu/gtt write domains
Since we know the write domain, we can drop the local variable and make
the code look a tiny bit simpler.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-12-chris@chris-wilson.co.uk
2016-08-18 22:36:46 +01:00
Chris Wilson
9764951e7f drm/i915: Pin the pages first in shmem prepare read/write
There is an improbable, but not impossible, case that if we leave the
pages unpin as we operate on the object, then somebody via the shrinker
may steal the lock (which lock? right now, it is struct_mutex, THE lock)
and change the cache domains after we have already inspected them.

(Whilst here, avail ourselves of the opportunity to take a couple of
steps to make the two functions look more similar.)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-11-chris@chris-wilson.co.uk
2016-08-18 22:36:45 +01:00
Chris Wilson
3b5724d702 drm/i915: Wait for writes through the GTT to land before reading back
If we quickly switch from writing through the GTT to a read of the
physical page directly with the CPU (e.g. performing relocations through
the GTT and then running the command parser), we can observe that the
writes are not visible to the CPU. It is not a coherency problem, as
extensive investigations with clflush have demonstrated, but a mere
timing issue - we have to wait for the GTT to complete it's write before
we start our read from the CPU.

The issue can be illustrated in userspace with:

	gtt = gem_mmap__gtt(fd, handle, 0, OBJECT_SIZE, PROT_READ | PROT_WRITE);
	cpu = gem_mmap__cpu(fd, handle, 0, OBJECT_SIZE, PROT_READ | PROT_WRITE);
	gem_set_domain(fd, handle, I915_GEM_DOMAIN_GTT, I915_GEM_DOMAIN_GTT);

	for (i = 0; i < OBJECT_SIZE / 64; i++) {
		int x = 16*i + (i%16);
		gtt[x] = i;
		clflush(&cpu[x], sizeof(cpu[x]));
		assert(cpu[x] == i);
	}

Experimenting with that shows that this behaviour is indeed limited to
recent Atom-class hardware.

Testcase: igt/gem_exec_flush/basic-batch-default-cmd #byt
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-10-chris@chris-wilson.co.uk
2016-08-18 22:36:45 +01:00
Chris Wilson
a314d5cb4a drm/i915: Before accessing an object via the cpu, flush GTT writes
If we want to read the pages directly via the CPU, we have to be sure
that we have to flush the writes via the GTT (as the CPU can not see
the address aliasing).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-9-chris@chris-wilson.co.uk
2016-08-18 22:36:44 +01:00
Chris Wilson
43394c7d0d drm/i915: Extract i915_gem_obj_prepare_shmem_write()
This is a companion to i915_gem_obj_prepare_shmem_read() that prepares
the backing storage for direct writes. It first serialises with the GPU,
pins the backing storage and then indicates what clfushes are required in
order for the writes to be coherent.

Whilst here, fix support for ancient CPUs without clflush for which we
cannot do the GTT+clflush tricks.

v2: Add i915_gem_obj_finish_shmem_access() for symmetry

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-8-chris@chris-wilson.co.uk
2016-08-18 22:36:44 +01:00
Chris Wilson
31a39207f0 drm/i915: Cache kmap between relocations
When doing relocations, we have to obtain a mapping to the page
containing the target address. This is either a kmap or iomap depending
on GPU and its cache coherency. Neighbouring relocation entries are
typically within the same page and so we can cache our kmapping between
them and avoid those pesky TLB flushes.

Note that there is some sleight-of-hand in how the slow relocate works
as the reloc_entry_cache implies pagefaults disabled (as we are inside a
kmap_atomic section). However, the slow relocate code is meant to be the
fallback from the atomic fast path failing. Fortunately it works as we
already have performed the copy_from_user for the relocation array (no
more pagefaults there) and the kmap_atomic cache is enabled after we
have waited upon an active buffer (so no more sleeping in atomic).
Magic!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-7-chris@chris-wilson.co.uk
2016-08-18 22:36:43 +01:00
Chris Wilson
1803458478 drm/i915: Fallback to single page pwrite/pread if unable to release fence
If we cannot release the fence (for example if someone is inexplicably
trying to write into a tiled framebuffer that is currently pinned to the
display! *cough* kms_frontbuffer_tracking *cough*) fallback to using the
page-by-page pwrite/pread interface, rather than fail the syscall
entirely.

Since this is triggerable by the user (along pwrite) we have to remove
the WARN_ON(fence->pin_count).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-6-chris@chris-wilson.co.uk
2016-08-18 22:36:43 +01:00
Chris Wilson
d243ad8202 drm/i915: Mark up the GTT flush following WC writes as ORIGIN_CPU
Similarly to invalidating beforehand, if the object is mmapped via
I915_MMAP_WC we cannot track writes through the I915_GEM_DOMAIN_GTT. At
the conclusion of the write, i915_gem_object_flush_gtt_writes() we also
need to treat the origin carefully in case it may have been untracked.

See also commit aeecc9696a ("drm/i915: use ORIGIN_CPU for frontbuffer
invalidation on WC mmaps").

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-5-chris@chris-wilson.co.uk
2016-08-18 22:36:43 +01:00
Chris Wilson
b19482d7ce drm/i915: Use ORIGIN_CPU for fb invalidation from pwrite
As pwrite does not use the fence for its GTT access, and may even go
through a secondary interface avoiding the main VMA, we cannot treat the
write as automatically invalidated by the hardware and so we require
ORIGIN_CPU frontbufer invalidate/flushes.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-4-chris@chris-wilson.co.uk
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-08-18 22:36:24 +01:00
Chris Wilson
4b30cb2334 drm/i915: vfree() no longer ignores the low bits of the address
Since vfree() now likes to WARN when passed a non-page-aligned pointer,
we need to discard the low bits to comply with it.

Fixes: d31d7cb146 ("drm/i915: Support for creating write combined type vmaps")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-3-chris@chris-wilson.co.uk
2016-08-18 22:36:23 +01:00
Chris Wilson
600f436801 drm/i915: Unconditionally flush any chipset buffers before execbuf
If userspace is asynchronously streaming into the batch or other
execobjects, we may not flush those writes along with a change in cache
domain (as there is no change). Therefore those writes may end up in
internal chipset buffers and not visible to the GPU upon execution. We
must issue a flush command or otherwise we encounter incoherency in the
batchbuffers and the GPU executing invalid commands (i.e. hanging) quite
regularly.

v2: Throw a paranoid wmb() into the general flush so that we remain
consistent with before.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90841
Fixes: 1816f92363 ("drm/i915: Support creation of unbound wc user...")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Akash Goel <akash.goel@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Tested-by: Matti Hämäläinen <ccr@tnsp.org>
Cc: stable@vger.kernel.org
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-1-chris@chris-wilson.co.uk
2016-08-18 22:36:15 +01:00
Matt Roper
43aa7e8750 drm/i915/gen9: Drop invalid WARN() during data rate calculation
It's possible to have a non-zero plane mask and still wind up with a
total data rate of zero.  There are two cases where this can happen:

 * planes are active (from the KMS point of view), but are
   all fully clipped (positioned offscreen)
 * the only active plane on a CRTC is the cursor (which is handled
   independently and not counted into the general data rate computations

These are both valid display setups (although unusual), so we need to
drop the WARN().

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Testcase: kms_universal_planes.cursor-only-pipe-*
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1466196140-16336-4-git-send-email-matthew.d.roper@intel.com
Cc: stable@vger.kernel.org #v4.7+
2016-08-18 14:26:54 +02:00
Matt Roper
1b54a880b2 drm/i915/gen9: Initialize intel_state->active_crtcs during WM sanitization (v2)
intel_state->active_crtcs is usually only initialized when doing a
modeset.  During our first atomic commit after boot, we're effectively
faking a modeset to sanitize the DDB/wm setup, so ensure that this field
gets initialized before use.

v2:
 - Don't clobber active_crtcs if our first commit really is a modeset
   (Maarten)
 - Grab connection_mutex when faking a modeset during sanitization
   (Maarten)

Reported-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Tested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1466196140-16336-2-git-send-email-matthew.d.roper@intel.com
Cc: stable@vger.kernel.org #v4.7+
2016-08-18 14:26:41 +02:00
Chris Wilson
ceae5317f6 drm/i915: Add missing kerneldoc for guc_client_alloc:engines
Brief parameter text to describe @engines.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Gordon <david.s.gordon@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471437762-22936-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-08-17 14:25:49 +01:00
Chris Wilson
dbaf788e71 drm/i915: Remember to set vma->pages for the preallocated stolen object
In commit 247177ddd5 ("drm/i915: Always set the vma->pages"), as it
title implies, we always set vma->pages for bound objects. Even before
that, we would set vma->ggtt_view.pages, for globally bound objects.
This was forgotten for the fixup inside the preallocated stolen objects,
which has to recreate a global GTT binding outside of the usual VMA
insertion path

Fixes: 247177ddd5 ("drm/i915: Always set the vma->pages")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471430013-3449-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-08-17 13:55:08 +01:00
Chris Wilson
24808e9679 drm/i915: Mark i915_hpd_poll_init_work as static
Local function with forgotten static declaration.

Fixes: 19625e85c6 ("drm/i915: Enable polling when we don't have hpd")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Lyude <cpaul@redhat.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471432146-5196-2-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-08-17 12:36:15 +01:00
Chris Wilson
21aea5cc06 drm/i915: Mark the static key for movntqda as static
Silence sparse who warns that the global variable is not declared
static.

Fixes: 0b1de5d58e ("drm/i915: Use SSE4.1 movntdqa to ...")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Akash Goel <akash.goel@intel.com>
Cc: Damien Lespiau <damien.lespiau@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471432146-5196-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-08-17 12:36:07 +01:00
Tvrtko Ursulin
318f89ca20 drm/i915: Initialize legacy semaphores from engine hw id indexed array
Build the legacy semaphore initialisation array using the engine
hardware ids instead of driver internal ones. This makes the
static array size dependent only on the number of gen6 semaphore
engines.

Also makes the per-engine semaphore wait and signal tables
hardware id indexed saving some more space.

v2: Refactor I915_GEN6_NUM_ENGINES to GEN6_SEMAPHORE_LAST. (Chris Wilson)
v3: More polish. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1471363461-9973-1-git-send-email-tvrtko.ursulin@linux.intel.com
2016-08-17 11:29:56 +01:00
Tvrtko Ursulin
5ec2cf7e34 drm/i915: Add enum for hardware engine identifiers
Put the engine hardware id in the common header so they are
not only associated with the GuC since they are needed for
the legacy semaphores implementation.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-08-17 11:29:56 +01:00
Chris Wilson
ca99d8781f drm/i915: Silence GCC warning for cmn_a_well
Just make the logic simple enough for even GCC to understand (and
foolproof against random changes):

drivers/gpu/drm/i915/intel_runtime_pm.c: warning: 'cmn_a_well' may be
used uninitialized in this function [-Wuninitialized]:  => 871:23

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1471284383-22324-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Imre Deak <imre.deak@intel.com>
2016-08-16 10:53:29 +01:00
Chris Wilson
1255501d86 drm/i915: Embrace the race in busy-ioctl
Daniel Vetter proposed a new challenge to the serialisation inside the
busy-ioctl that exposed a flaw that could result in us reporting the
wrong engine as being busy. If the request is reallocated as we test
its busyness and then reassigned to this object by another thread, we
would not notice that the test itself was incorrect.

We are faced with a choice of using __i915_gem_active_get_request_rcu()
to first acquire a reference to the request preventing the race, or to
acknowledge the race and accept the limitations upon the accuracy of the
busy flags. Note that we guarantee that we never falsely report the
object as idle (providing userspace itself doesn't race), and so the
most important use of the busy-ioctl and its guarantees are fulfilled.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471337440-16777-1-git-send-email-chris@chris-wilson.co.uk
2016-08-16 10:35:02 +01:00
Chris Wilson
1544c42ed9 drm/i915: Initialise mmaped_count for i915_gem_object_info
Reported-by: 0day kbuild test robot
Fixes: 2bd160a131 ("drm/i915: Reduce i915_gem_objects to only show...")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1471263496-27537-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2016-08-15 14:02:41 +01:00
Chris Wilson
21a2c58a9c drm/i915: Record the RING_MODE register for post-mortem debugging
Just another useful register to inspect following a GPU hang.

v2: Remove partial decoding of RING_MODE to userspace, be consistent and
use GEN > 2 guards around RING_MODE everywhere.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-32-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:01:19 +01:00
Chris Wilson
57bc699d43 drm/i915: Only record active and pending requests upon a GPU hang
There is no other state pertaining to the completed requests in the
hang, other than gleamed through the ringbuffer, so including the
expired requests in the list of outstanding requests simply adds noise.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-31-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:01:18 +01:00
Chris Wilson
03382dfb63 drm/i915: Print the batchbuffer offset next to BBADDR in error state
It is useful when looking at captured error states to check the recorded
BBADDR register (the address of the last batchbuffer instruction loaded)
against the expected offset of the batch buffer, and so do a quick check
that (a) the capture is true or (b) HEAD hasn't wandered off into the
badlands.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-30-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:01:16 +01:00
Chris Wilson
c84455b4ba drm/i915: Move debug only per-request pid tracking from request to ctx
Since contexts are not currently shared between userspace processes, we
have an exact correspondence between context creator and guilty batch
submitter. Therefore we can save some per-batch work by inspecting the
context->pid upon error instead. Note that we take the context's
creator's pid rather than the file's pid in order to better track fd
passed over sockets.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-29-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:01:15 +01:00
Chris Wilson
bde13ebdab drm/i915: Introduce i915_ggtt_offset()
This little helper only exists to safely discard the upper unused 32bits
of the general 64-bit VMA address - as we know that all Global GTT
currently are less than 4GiB in size and so that the upper bits must be
zero. In many places, we use a u32 for the global GTT offset and we want
to document where we are discarding the full VMA offset.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-28-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:01:14 +01:00
Chris Wilson
058d88c433 drm/i915: Track pinned VMA
Treat the VMA as the primary struct responsible for tracking bindings
into the GPU's VM. That is we want to treat the VMA returned after we
pin an object into the VM as the cookie we hold and eventually release
when unpinning. Doing so eliminates the ambiguity in pinning the object
and then searching for the relevant pin later.

v2: Joonas' stylistic nitpicks, a fun rebase.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-27-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:01:13 +01:00
Chris Wilson
19880c4a3f drm/i915: Consolidate i915_vma_unpin_and_release()
In a few places, we repeat a call to clear a pointer to a vma whilst
unpinning and releasing a reference to its owner. Refactor those into a
common function.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-26-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:01:12 +01:00
Chris Wilson
48bb74e48b drm/i915: Use VMA for wa_ctx tracking
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-25-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:01:11 +01:00
Chris Wilson
a5e85c8a4d drm/i915: Use VMA for render state page tracking
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-24-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:01:11 +01:00
Chris Wilson
51d545d026 drm/i915: Use VMA as the primary tracker for semaphore page
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-23-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:01:10 +01:00
Chris Wilson
9b3b7841b8 drm/i915/overlay: Use VMA as the primary tracker for images
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-22-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:01:09 +01:00
Chris Wilson
57f275a22b drm/i915: Move common seqno reset to intel_engine_cs.c
Since the intel_engine_init_seqno() is shared by all engine submission
backends, move it out of the legacy intel_ringbuffer.c and
into the new home for common routines, intel_engine_cs.c

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-21-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:01:08 +01:00
Chris Wilson
adc320c4b7 drm/i915: Move common scratch allocation/destroy to intel_engine_cs.c
Since the scratch allocation and cleanup is shared by all engine
submission backends, move it out of the legacy intel_ringbuffer.c and
into the new home for common routines, intel_engine_cs.c

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-20-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:01:08 +01:00
Chris Wilson
56c0f1a7c1 drm/i915: Use VMA for scratch page tracking
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-19-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:01:07 +01:00
Chris Wilson
57e8853181 drm/i915: Use VMA for ringbuffer tracking
Use the GGTT VMA as the primary cookie for handing ring objects as
the most common action upon the ring is mapping and unmapping which act
upon the VMA itself. By restructuring the code to work with the ring
VMA, we can shrink the code and remove a few cycles from context pinning.

v2: Move the flush of the object back to before the first pin. We use
the am-I-bound? query to only have to check the flush on the first
bind and so avoid stalling on active rings.
Lots of little renames and small hoops.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-18-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:01:05 +01:00
Chris Wilson
e5cdb22b27 drm/i915: Move assertion for iomap access to i915_vma_pin_iomap
Access through the GTT requires the device to be awake. Ideally
i915_vma_pin_iomap() is short-lived and the pinning demarcates the
access through the iomap. This is not entirely true, we have a mixture
of long lived pins that exceed the wakelock (such as legacy ringbuffers)
and short lived pin that do live within the wakelock (such as execlist
ringbuffers).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-17-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:01:04 +01:00
Chris Wilson
7abc98fadf drm/i915: Only change the context object's domain when binding
We know that the only access to the context object is via the GPU, and
the only time when it can be out of the GPU domain is when it is swapped
out and unbound. Therefore we only need to clflush the object when
binding, thus avoiding any potential stall on touching the domain on an
active context.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-16-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:01:03 +01:00
Chris Wilson
bf3783e52a drm/i915: Use VMA as the primary object for context state
When working with contexts, we most frequently want the GGTT VMA for the
context state, first and foremost. Since the object is available via the
VMA, we need only then store the VMA.

v2: Formatting tweaks to debugfs output, restored some comments removed
in the next patch

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-15-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:01:02 +01:00
Chris Wilson
f23eda8cb4 drm/i915: Use VMA directly for checking tiling parameters
v2: Rename functions to suit their more active role

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-14-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:01:01 +01:00
Chris Wilson
a83718b681 drm/i915: Convert fence computations to use vma directly
Lookup the GGTT vma once for the object assigned to the fence, and then
derive everything from that vma.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-13-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:01:00 +01:00
Chris Wilson
8b797af189 drm/i915: Track pinned vma inside guc
Since the guc allocates and pins and object into the GGTT for its usage,
it is more natural to use that pinned VMA as our resource cookie.

v2: Embrace naming tautology
v3: Rewrite comments for guc_allocate_vma()

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-12-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:00:58 +01:00
Chris Wilson
624192cfd3 drm/i915: Add convenience wrappers for vma's object get/put
The VMA are unreferenced, they belong to the object and live until they
are closed. However, if we want to use the VMA as a cookie and use it to
keep the object alive, we want to hold onto a reference to the object
for the lifetime of the VMA cookie. To facilitate this, add a couple of
simple wrappers for managing the reference count on the object owning the
VMA.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-11-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:00:57 +01:00
Chris Wilson
78ef2d9aba drm/i915: Add fetch_and_zero() macro
A simple little macro to clear a pointer and return the old value. This
is useful for writing

	value = *ptr;
	if (!value)
		return;

	*ptr = 0;
	...
	free(value);

in a slightly more concise form:

	value = fetch_and_zero(ptr);
	if (!value)
		return;

	...
	free(value);

with the idea that this establishes a pattern that may be extended for
atomic use (using xchg or cmpxchg) i.e. atomic_fetch_and_zero() and
similar to llist.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-10-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:00:56 +01:00
Chris Wilson
81a8aa4a6c drm/i915: Create a VMA for an object
In many places, we wish to store the VMA in preference to the object
itself and so being able to create the persistent VMA is useful.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-9-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:00:55 +01:00
Chris Wilson
247177ddd5 drm/i915: Always set the vma->pages
Previously, we would only set the vma->pages pointer for GGTT entries.
However, if we always set it, we can use it to prettify some code that
may want to access the backing store associated with the VMA (as
assigned to the VMA).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-8-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:00:54 +01:00
Chris Wilson
95b2ab56a5 drm/i915: Remove redundant WARN_ON from __i915_add_request()
It's an outright programming error, so explode if it is ever hit.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-7-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:00:53 +01:00
Chris Wilson
2bd160a131 drm/i915: Reduce i915_gem_objects to only show object information
No longer is knowing how much of the GTT (both mappable aperture and
beyond) relevant, and the output clutters the real information - that is
how many objects are allocated and bound (and by who) so that we can
quickly grasp if there is a leak.

v2: Relent, and rename pinned to indicate display only. Since the
display objects are semi-static and are of variable size, they are the
interesting objects to watch over time for aperture leaking. The other
pins are either static (such as the scratch page) or very short lived
(such as execbuf) and not part of the precious GGTT.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-6-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:00:52 +01:00
Chris Wilson
6da8482936 drm/i915: Focus debugfs/i915_gem_pinned to show only display pins
Only those objects pinned to the display have semi-permanent pins of a
global nature (other pins are transient within their local vm). Simplify
i915_gem_pinned to only show the pertinent information about the pinned
objects within the GGTT.

v2: i915_gem_gtt_info is still shared with debugfs/i915_gem_gtt,
rename i915_gem_pinned to i915_gem_pin_display to better reflect its
contents

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-5-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:00:52 +01:00
Chris Wilson
61fb00d6ea drm/i915: Remove inactive/active list from debugfs
These two files (i915_gem_active, i915_gem_inactive) no longer give
pertinent information since active/inactive tracking is per-vm and so we
need the information per-vm. They are obsolete so remove them.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-4-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:00:51 +01:00
Chris Wilson
546b1b6a40 drm/i915: Store the active context object on all engines upon error
With execlists, we have context objects everywhere, not just RCS. So
store them for post-mortem debugging. This also has a secondary effect
of removing one more unsafe list iteration with using preserved state
from the hanging request. And now we can cross-reference the request's
context state with that loaded by the GPU.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-3-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:00:50 +01:00
Chris Wilson
c0ce466361 drm/i915: Reduce amount of duplicate buffer information captured on error
When capturing the error state, we do not need to know about every
address space - just those that are related to the error. We know which
context is active at the time, therefore we know which VM are implicated
in the error. We can then restrict the VM which we report to the
relevant subset.

v2: s/i/count_active/ (and similar)
    Rewrite label generation for "Buffers"

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-2-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:00:49 +01:00
Chris Wilson
d045446df1 drm/i915: Record the position of the start of the request
Not only does it make for good documentation and debugging aide, but it is
also vital for when we want to unwind requests - such as when throwing away
an incomplete request.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1470414607-32453-2-git-send-email-arun.siluvery@linux.intel.com
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-1-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:00:49 +01:00
Chris Wilson
7466c291b1 drm/i915: Show RPS autotuning thresholds along with waitboost
For convenience when debugging user issues show the autotuning
RPS parameters in debugfs/i915_rps_boost_info.

v2: Refine the presentation
v3: Style

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: fritsch@kodi.tv
Link: http://patchwork.freedesktop.org/patch/msgid/1471181336-27523-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471250973-31277-1-git-send-email-chris@chris-wilson.co.uk
2016-08-15 10:03:46 +01:00
Daniel Vetter
cc9263874b Merge remote-tracking branch 'airlied/drm-next' into drm-intel-next-queued
Backmerge because too many conflicts, and also we need to get at the
latest struct fence patches from Gustavo. Requested by Chris Wilson.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2016-08-15 10:41:47 +02:00
Dave Airlie
fc93ff608b Merge tag 'drm-intel-next-2016-08-08' of git://anongit.freedesktop.org/drm-intel into drm-next
- refactor ddi buffer programming a bit (Ville)
- large-scale renaming to untangle naming in the gem code (Chris)
- rework vma/active tracking for accurately reaping idle mappings of shared
  objects (Chris)
- misc dp sst/mst probing corner case fixes (Ville)
- tons of cleanup&tunings all around in gem
- lockless (rcu-protected) request lookup, plus use it everywhere for
  non(b)locking waits (Chris)
- pipe crc debugfs fixes (Rodrigo)
- random fixes all over

* tag 'drm-intel-next-2016-08-08' of git://anongit.freedesktop.org/drm-intel: (222 commits)
  drm/i915: Update DRIVER_DATE to 20160808
  drm/i915: fix aliasing_ppgtt leak
  drm/i915: Update comment before i915_spin_request
  drm/i915: Use drm official vblank_no_hw_counter callback.
  drm/i915: Fix copy_to_user usage for pipe_crc
  Revert "drm/i915: Track active streams also for DP SST"
  drm/i915: fix WaInsertDummyPushConstPs
  drm/i915: Assert that the request hasn't been retired
  drm/i915: Repack fence tiling mode and stride into a single integer
  drm/i915: Document and reject invalid tiling modes
  drm/i915: Remove locking for get_tiling
  drm/i915: Remove pinned check from madvise ioctl
  drm/i915: Reduce locking inside swfinish ioctl
  drm/i915: Remove (struct_mutex) locking for busy-ioctl
  drm/i915: Remove (struct_mutex) locking for wait-ioctl
  drm/i915: Do a nonblocking wait first in pread/pwrite
  drm/i915: Remove unused no-shrinker-steal
  drm/i915: Tidy generation of the GTT mmap offset
  drm/i915/shrinker: Wait before acquiring struct_mutex under oom
  drm/i915: Simplify do_idling() (Ironlake vt-d w/a)
  ...
2016-08-15 16:53:57 +10:00
Dave Airlie
f8725ad1da Merge tag 'topic/drm-misc-2016-08-12' of git://anongit.freedesktop.org/drm-intel into drm-next
- more fence destaging and cleanup (Gustavo&Sumit)
- DRIVER_LEGACY to untangle from DRIVER_MODESET
- drm_mm refactor (Chris)
- fbdev-less compile fies
- clipped plane src/dst rects (Ville)
- + a few mediatek patches that build on top of that (Bibby+Daniel)
- small stuff all over really

* tag 'topic/drm-misc-2016-08-12' of git://anongit.freedesktop.org/drm-intel: (43 commits)
  dma-buf/fence: kerneldoc: remove spurious section header
  dma-buf/fence: kerneldoc: remove unused struct members
  Revert "gpu: drm: omapdrm: dss-of: add missing of_node_put after calling of_parse_phandle"
  drm: Protect fb_defio in drivers with CONFIG_KMS_FBDEV_EMULATION
  drm/radeon|amgpu: Make fbdev emulation optional
  drm/vmwgfx: select CONFIG_FB
  drm: Remove superflous linux/fb.h includes
  drm/fb-helper: Add a dummy remove_conflicting_framebuffers
  dma-buf/sync_file: only enable fence signalling on poll()
  Documentation: add doc for sync_file_get_fence()
  dma-buf/sync_file: add sync_file_get_fence()
  dma-buf/sync_file: refactor fence storage in struct sync_file
  dma-buf/fence-array: add fence_is_array()
  drm/dp_helper: Rate limit timeout errors from drm_dp_i2c_do_msg()
  drm/dp_helper: Print first error received on failure in drm_dp_dpcd_access()
  drm: Add ratelimited versions of the DRM_DEBUG* macros
  drm: Make sure drm_vblank_no_hw_counter isn't abused
  drm/mediatek: Fix mtk_atomic_complete for runtime_pm
  drm/mediatek: plane: Use FB's format's cpp to compute x offset
  drm/mediatek: plane: Merge mtk_plane_enable into mtk_plane_atomic_update
  ...
2016-08-15 16:46:36 +10:00
Dave Airlie
a02b5a155e imx-drm updates and encoder atomic_mode_set helper callback
- add pixel clock and DE polarity configuration from device tree
   using display timing bindings for parallel and LVDS output
 - cleanup/remove trivial functions
 - cleanup and fixes in preparation for capture support
 - add atomic_mode_set helper and use it in imx-ldb - this is an
   alternative to the encoder mode_set callback that passes the
   crtc and connector state instead of just the mode. It allows
   drivers to get information from the attached connector without
   having to iterate over all connectors
 - add drm_bridge support to imx-ldb, for bridges attached via LVDS
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCAAGBQJXrXKYAAoJEFDCiBxwnmDrXocQAJVTaIbgwsq4G5CH3X3M3dRu
 vpIu71k2ww1YLktmE25AkIWKpMAdVQl+DwU3ARwqPtU6QRayolrYNgpmZAUzTy1a
 jk1CBYbhD+mi9uDaWo7HirGdClb0zs6tL1S56VnCA5AD36nix0ho6zkIutzxKFY6
 mFRFcIQPYbSykffyAaeyoVOqKUtQ4mdscMhIutrJABUX4U7PUhYWAWhv5PBvtLGa
 wfL/UA3aoDOK69M9z8ubdKhfKPKUCyU2BIgNdY3T5UunSEhLVAdHsp/1E1nguYQo
 x/ZzcO5+BZpNUeV3XStC4KTr5tl42TwTpXhmV9yVi+EGycRRp51U3GDyh7R3rRbO
 5R5c6iunwRth3R3vmRLLxU89sOfaq5m7qOkOerKtxAwvsAzCizLHyZIQdP1KMHsa
 fC3DmPIJZKdjI0Z5yED8mzEi25aT0rZiTMPwTmSyoKZ5SHH6udfHtraNTJwKsNIY
 Q4+9q21Q/tDV65xgzk/u9VFQl3S4ZuH54gdQ8s3UR0TWEmZyfSQ84qxTNu9Obeq5
 TqjwzJLktJ3yPKR6TEeiT/yRelmTobQGLbdwq+x0dcxfbgh/IE897my3OcsS3MjG
 X0kvNISSbtLZe2ShKX+sVTukQP+88IS84F5skGhkEpityUPkf4fyMo4x/tX/ygpN
 PceRsKJdoCEwoHGgN/jC
 =hj4e
 -----END PGP SIGNATURE-----

Merge tag 'imx-drm-next-2016-08-12' of git://git.pengutronix.de/git/pza/linux into drm-next

imx-drm updates and encoder atomic_mode_set helper callback

- add pixel clock and DE polarity configuration from device tree
  using display timing bindings for parallel and LVDS output
- cleanup/remove trivial functions
- cleanup and fixes in preparation for capture support
- add atomic_mode_set helper and use it in imx-ldb - this is an
  alternative to the encoder mode_set callback that passes the
  crtc and connector state instead of just the mode. It allows
  drivers to get information from the attached connector without
  having to iterate over all connectors
- add drm_bridge support to imx-ldb, for bridges attached via LVDS

* tag 'imx-drm-next-2016-08-12' of git://git.pengutronix.de/git/pza/linux:
  drm/imx-ldb: Add support to drm-bridge
  drm/imx: imx-ldb: use encoder atomic_mode_set callback
  drm/atomic-helper: Add atomic_mode_set helper callback
  drm/imx: Remove imx_drm_handle_vblank()
  gpu: ipu-v3: Add missing IDMAC channel names
  gpu: ipu-v3: rename CSI client device
  gpu: ipu-v3: Fix IRT usage
  gpu: ipu-v3: Fix CSI data format for 16-bit media bus formats
  gpu: ipu-v3: set correct full sensor frame for PAL/NTSC
  gpu: ipu-v3: Add VDI input IDMAC channels
  gpu: ipu-v3: Add ipu_get_num()
  gpu: ipu-cpmem: Add ipu_cpmem_get_burstsize()
  gpu: ipu-cpmem: Add ipu_cpmem_set_uv_offset()
  drm/imx: Remove imx_drm_crtc_id()
  drm/imx: Remove imx_drm_crtc_vblank_get/_put()
  drm/imx: convey the pixelclk-active and de-active flags from DT to the ipu-di driver
  drm: add a helper function to extract 'de-active' and 'pixelclk-active' from DT
2016-08-15 16:41:45 +10:00
Dave Airlie
c13eb9315f mediatek-drm maintainers and gamma correction
- add MAINTAINERS entry for mediatek-drm driver
 - add support for AAL and GAMMA engines
 - hook up gamma correction LUT
 - add support for temporal dithering to OD and GAMMA engines
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCAAGBQJXrXDqAAoJEFDCiBxwnmDrWFkP/RtEAKhOm5mzNksZb7FPv5Cd
 ngh/jN/9+q7JFoMstj/k9j0cUh3FOBQ2kh3AqKjH6R9PgZt9H9FGCgQWgVI7BNps
 yGZBSPogy3n+T9uZP0HTJRWXCnvaJbK5otNsmHmpdwHnXjUu6Li4aNDeg22MRiEI
 qQTE4UPm2Ry8FCRnQHYYXFMJFzYQQwWco5aq6Oja2YBvOunHuS/TcWQmW41kuPTc
 /V2XbF3Zx5RTfoOpBEvp5/tay3dKc2kE0FtggdzGBvJ4+b/Hx6c7ZvXH568C3nBw
 EqUC9+c3r/LA+KP3vALYF59gtt1ZzTsLB2sArGoo4UkNHNMjzqjjbANJDekezVhZ
 M8nbLgRWY/zF0u4/3xdr8t6VlPeUSUN85NW+8e2tN+CQu40crr/TRa7qB+JtNrLu
 09SjQ9qKkByvGA27jrQmjF5xtqrnxnB0XDh1GiAiesMGeHXCBeserCHS6tsKhgMX
 aa4uhuMW1TeBHNvfmjYJ4Vrl2OlqOk84mCyTFjsByMfJAOav8QGepasmbtE000PZ
 6IgtRn2E40Ml2zRBDOUm+B1EKLkJ5GGlwdkFTwBN15C7SDvobsdvKxN0KqLqbLhH
 mKBEepexdeNSYN7SpuXLXKyO+4JDaEW77qGy482D/0vUjvsogYhcCU6yvW53bOlQ
 QU7yz2Vm2oMdzhULwHrg
 =rl4u
 -----END PGP SIGNATURE-----

Merge tag 'mediatek-drm-next-2016-08-12' of git://git.pengutronix.de/git/pza/linux into drm-next

mediatek-drm maintainers and gamma correction

- add MAINTAINERS entry for mediatek-drm driver
- add support for AAL and GAMMA engines
- hook up gamma correction LUT
- add support for temporal dithering to OD and GAMMA engines

* tag 'mediatek-drm-next-2016-08-12' of git://git.pengutronix.de/git/pza/linux:
  drm/mediatek: set mt8173 dithering function
  drm/mediatek: Add gamma correction.
  drm/mediatek: Add GAMMA engine basic function
  drm/mediatek: Add AAL engine basic function
  drm: mediatek: add Maintainers entry for Mediatek DRM drivers
2016-08-15 16:41:17 +10:00
Dave Airlie
db4743227b Merge branch 'drm-next-tilcdc-atomic' of https://github.com/jsarha/linux into drm-next
Please pull tilcdc atomic modeset support and some non critical fixes.

* 'drm-next-tilcdc-atomic' of https://github.com/jsarha/linux: (29 commits)
  drm/tilcdc: Change tilcdc_crtc_page_flip() to tilcdc_crtc_update_fb()
  drm/tilcdc: Remove unnecessary pm_runtime_get() and *_put() calls
  drm/tilcdc: Get rid of legacy dpms mechanism
  drm/tilcdc: Use drm_atomic_helper_resume/suspend()
  drm/tilcdc: Enable and disable interrupts in crtc start() and stop()
  drm/tilcdc: tfp410: Add atomic modeset helpers to connector funcs
  drm/tilcdc: tfp410: Set crtc panel info at init phase
  drm/tilcdc: panel: Add atomic modeset helpers to connector funcs
  drm/tilcdc: panel: Set crtc panel info at init phase
  drm/tilcdc: Remove tilcdc_verify_fb()
  drm/tilcdc: Remove obsolete crtc helper functions
  drm/tilcdc: Set DRIVER_ATOMIC and use atomic crtc helpers
  drm/tilcdc: Add drm_mode_config_reset() call to tilcdc_load()
  drm/tilcdc: Add atomic mode config funcs
  drm/tilcdc: Add tilcdc_crtc_atomic_check()
  drm/tilcdc: Add tilcdc_crtc_mode_set_nofb()
  drm/tilcdc: Initialize dummy primary plane from crtc init
  drm/tilcdc: Add dummy primary plane implementation
  drm/tilcdc: Make tilcdc_crtc_page_flip() work if crtc is not yet on
  drm/tilcdc: Make tilcdc_crtc_page_flip() public
  ...
2016-08-15 16:39:28 +10:00
Chris Wilson
02bef8f98d drm/i915: Unbind closed vma for i915_gem_object_unbind()
Closed vma are removed from the obj->vma_list so that they cannot be
found by userspace. However, this means that when forcibly unbinding an
object, we have to wait upon all rendering to that object first in order
for the closed, but active, vma to be reaped and their bindings removed.

Reported-by: Matthew Auld <matthew.auld@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97343
Fixes: aa653a685d ("drm/i915: Be more careful when unbinding vma")
Fixes: 8a3b3d576c (" drm/i915: Convert non-blocking userptr waits...")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471196681-30043-2-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Tested-by: Matthew Auld <matthew.auld@intel.com>
2016-08-14 19:40:09 +01:00
Chris Wilson
35a9611ca0 drm/i915: Initialize return value for empty i915_gem_object_unbind()
If the obj->vma_list is empty, we immediately return ret. However, we
are doing so having never set it to any value, it should be zero!

Reported-by: Matthew Auld <matthew.auld@intel.com>
References: https://bugs.freedesktop.org/show_bug.cgi?id=97343
Fixes: aa653a685d ("drm/i915: Be more careful when unbinding vma")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471196681-30043-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
2016-08-14 19:38:26 +01:00
Chris Wilson
0b1de5d58e drm/i915: Use SSE4.1 movntdqa to accelerate reads from WC memory
This patch provides the infrastructure for performing a 16-byte aligned
read from WC memory using non-temporal instructions introduced with sse4.1.
Using movntdqa we can bypass the CPU caches and read directly from memory
and ignoring the page attributes set on the CPU PTE i.e. negating the
impact of an otherwise UC access. Copying using movntdqa from WC is almost
as fast as reading from WB memory, modulo the possibility of both hitting
the CPU cache or leaving the data in the CPU cache for the next consumer.
(The CPU cache itself my be flushed for the region of the movntdqa and on
later access the movntdqa reads from a separate internal buffer for the
cacheline.) The write back to the memory is however cached.

This will be used in later patches to accelerate accessing WC memory.

v2: Report whether the accelerated copy is successful/possible.
v3: Function alignment override was only necessary when using the
function target("sse4.1") - which is not necessary for emitting movntdqa
from __asm__.
v4: Improve notes on CPU cache behaviour vs non-temporal stores.
v5: Fix byte offsets for unrolled moves.
v6: Find all remaining typos of "movntqda", use kernel_fpu_begin.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Akash Goel <akash.goel@intel.com>
Cc: Damien Lespiau <damien.lespiau@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471001999-17787-2-git-send-email-chris@chris-wilson.co.uk
2016-08-12 13:06:37 +01:00
Chris Wilson
d31d7cb146 drm/i915: Support for creating write combined type vmaps
vmaps has a provision for controlling the page protection bits, with which
we can use to control the mapping type, e.g. WB, WC, UC or even WT.
To allow the caller to choose their mapping type, we add a parameter to
i915_gem_object_pin_map - but we still only allow one vmap to be cached
per object. If the object is currently not pinned, then we recreate the
previous vmap with the new access type, but if it was pinned we report an
error. This effectively limits the access via i915_gem_object_pin_map to a
single mapping type for the lifetime of the object. Not usually a problem,
but something to be aware of when setting up the object's vmap.

We will want to vary the access type to enable WC mappings of ringbuffer
and context objects on !llc platforms, as well as other objects where we
need coherent access to the GPU's pages without going through the GTT

v2: Remove the redundant braces around pin count check and fix the marker
     in documentation (Chris)

v3:
- Add a new enum for the vmalloc mapping type & pass that as an argument to
   i915_object_pin_map. (Tvrtko)
- Use PAGE_MASK to extract or filter the mapping type info and remove a
   superfluous BUG_ON.(Tvrtko)

v4:
- Rename the enums and clean up the pin_map function. (Chris)

v5: Drop the VM_NO_GUARD, minor cosmetics.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471001999-17787-1-git-send-email-chris@chris-wilson.co.uk
2016-08-12 13:06:36 +01:00
Peter Chen
5a78ff7bf7 Revert "gpu: drm: omapdrm: dss-of: add missing of_node_put after calling of_parse_phandle"
This reverts
commit 2ab9f58791
Author: Peter Chen <peter.chen@nxp.com>
Date:   Fri Jul 15 11:17:03 2016 +0800
    gpu: drm: omapdrm: dss-of: add missing of_node_put after calling of_parse_phandle

The of_get_next_parent will drop refcount on the passed node, so the reverted
patch is wrong, thanks for Tomi Valkeinen points it.

Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Acked-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/1470908694-16362-1-git-send-email-peter.chen@nxp.com
2016-08-12 07:10:37 -04:00
Daniel Vetter
0c756134cf drm: Protect fb_defio in drivers with CONFIG_KMS_FBDEV_EMULATION
For reasons that entirely elude me fb.h exposes all the structures,
even when it is not enabled. Except for special stuff like fb_defio.

Which means all the drivers which haven't yet switched over to the
defio support in the helpers and still roll their own, will fail
to compile when fbdev emulation is disabled. Protect just those
bits, as a gnarly reminder that conversion to the core defio helpers
would be good.

Cc: Dave Airlie <airlied@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1470847958-28465-6-git-send-email-daniel.vetter@ffwll.ch
2016-08-12 10:42:00 +02:00
Daniel Vetter
9db7d2b20c drm/radeon|amgpu: Make fbdev emulation optional
Seems to at least compile fine, only change needed was to use
the fb_set_suspend helper.

Cc: alexander.deucher@amd.com
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1470847958-28465-5-git-send-email-daniel.vetter@ffwll.ch
2016-08-12 10:41:53 +02:00
Daniel Vetter
183cd49f1d drm/vmwgfx: select CONFIG_FB
vmwgfx doesn't actually build without that.

It would be great if vmwgfx could switch over to the fbdev emulation
helpers, since those will take care of all this already. I guess one
reason to not do that was lack of defio support in the helpers, but
that's fixed now. If there's any other hold-up, we should figure out
what it is and whether it makes sense to address it.

Cc: Sinclair Yeh <syeh@vmware.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470847958-28465-4-git-send-email-daniel.vetter@ffwll.ch
2016-08-12 10:41:46 +02:00
Daniel Vetter
b1116f645c drm: Remove superflous linux/fb.h includes
Everyone who uses the fbdev emulation helpers doesn't need to include
fb.h directly. Remove it.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1470847958-28465-3-git-send-email-daniel.vetter@ffwll.ch
2016-08-12 10:41:39 +02:00
Daniel Vetter
44adece57e drm/fb-helper: Add a dummy remove_conflicting_framebuffers
Lots of drivers don't properly compile without this when CONFIG_FB=n.
It's kinda a hack, but since CONFIG_FB doesn't stub any fucntions when
it's disabled I think it makes sense to add it to drm_fb_helper.h.

Long term we probably need to rethink all the logic to unload firmware
framebuffer drivers, at least if we want to be able to move away from
CONFIG_FB and fbcon.

v2: Unfortunately just stubbing out remove_conflicting_framebuffers in
drm_fb_helper.h upset gcc about static vs. non-static declarations, so
a new wrapper it needs to be. Means more churn :(

Cc: Tobias Jakobi <tjakobi@math.uni-bielefeld.de>
Cc: Noralf Trønnes <noralf@tronnes.org>
Cc: tomi.valkeinen@ti.com
Cc: dh.herrmann@gmail.com
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470847958-28465-2-git-send-email-daniel.vetter@ffwll.ch
2016-08-12 10:41:18 +02:00
Ville Syrjälä
8d970654b7 drm/i915: Deal with NV12 CbCr plane AUX surface on SKL+
With NV12 we have two color planes to deal with so we must compute the
surface and x/y offsets for the second plane as well.

What makes this a bit nasty is that the hardware expects the surface
offset to be specified as a distance from the main surface offset.
What's worse, the distance must be non-negative (no neat wraparound or
anything). So we must make sure that the main surface offset is always
less or equal to the AUX surface offset. We do that by computing the AUX
offset first and the main surface offset second. If the main surface
offset ends up being above the AUX offset, we just push it down as far
as is required while still maintaining the required alignment etc.

Fortunately the AUX offset only reuqires 4K alignment, so we don't need
to do any of the backwards searching for an acceptable offset that we
must do for the main surface. And X tiled + NV12 isn't a supported
combination anyway.

Note that this just computes aux surface offsets, we do not yet program
them into the actual hardware registers, and hence we can't yet expose
NV12.

v2: Rebase due to drm_plane_state src/dst rects
    s/TODO.../something else/ in the commit message/ (Daniel)

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470821001-25272-12-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-08-11 18:35:23 +03:00
Ville Syrjälä
b63a16f6cd drm/i915: Compute display surface offset in the plane check hook for SKL+
SKL has nasty limitations with the display surface offsets:
* source x offset + width must be less than the stride for X tiled
  surfaces or the display engine falls over
* the surface offset requires lots of alignment (256K or 1M)

These facts mean that we can't just pick any suitably aligned tile
boundary as the offset and expect the resulting x offset to be useable.
The solution is to start with the closest boundary as before, but then
keep searching backwards until we find one that works, or don't. This
means we must be prepared to fail, hence the whole surface offset
calculation needs to be moved to the .check_plane() hook from the
.update_plane() hook.

While at it we can check that the source width/height don't exceed
maximum plane size limits.

We'll store the results of the computation in the plane state to make
it easy for the .update_plane() hook to do its thing.

v2: Replace for+break loop with while loop
    Rebase due to drm_plane_state src/dst rects
    Rebase due to plane_check_state()

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470821001-25272-11-git-send-email-ville.syrjala@linux.intel.com
2016-08-11 18:35:10 +03:00
Ville Syrjälä
66a2d927cb drm/i915: Make intel_adjust_tile_offset() work for linear buffers
To make life less surprising we can make intel_adjust_tile_offset()
deal with linear buffers as well. Currently it doesn't seem like there's
a real need for this since only X tiling and NV12 (which would always
be tiled currently) should need it. But I've used it for some debug
hacks already so seems like a reasonable thing to have.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470821001-25272-10-git-send-email-ville.syrjala@linux.intel.com
2016-08-11 18:34:39 +03:00
Ville Syrjälä
b9b2403845 drm/i915: Allow calling intel_adjust_tile_offset() multiple times
Minimize the resulting X coordinate after intel_adjust_tile_offset() is
done with it's offset adjustment. This allows calling
intel_adjust_tile_offset() multiple times in case we need to adjust
the offset several times.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470821001-25272-9-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-08-11 18:34:29 +03:00
Ville Syrjälä
60d5f2a45b drm/i915: Limit fb x offset due to fences
If there's a fence on the object it will be aligned to the start
of the object, and hence CPU rendering to any fb that straddles
the fence edge will come out wrong due to lines wrapping at the
wrong place.

We have no API to manage fences on a sub-object level, so we can't
really fix this in any way. Additonally gen2/3 fences are rather
coarse grained so adjusting the offset migth not even be possible.

Avoid these problems by requiring the fb layout to agree with the
fence layout (if present).

v2: Rebase due to i915_gem_object_get_tiling() & co.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470821001-25272-8-git-send-email-ville.syrjala@linux.intel.com
2016-08-11 18:34:12 +03:00
Ville Syrjälä
c2ff7370ae drm/i915: Adjust obj tiling vs. fb modifier rules
Currently we require the object to be X tiled if the fb is X
tiled. The argument is supposedly FBC GTT tracking. But
actually that no longer holds water since FBC supports
Y tiling as well on SKL+.

A better rule IMO is to require that if there is a fence, the
fb modifier match the object tiling mode. But if the object is linear,
we can allow the fb modifier to be anything. The idea being that
if the user set the tiling mode on the object, presumably the intention
is to actually use the fence for CPU access. But if the tiling mode is
not set, the user has no intention of using a fence (and can't actually
since we disallow tiling mode changes when there are framebuffers
associated with the object).

On gen2/3 we must keep to the rule that the object and fb
must be either both linear or both X tiled. No mixing allowed
since the display engine itself will use the fence if it's present.

v2: Fix typos
v3: Rebase due to i915_gem_object_get_tiling() & co.

Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470821001-25272-7-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-08-11 18:34:07 +03:00
Ville Syrjälä
72618ebfde drm/i915: Use fb modifiers for display tiling decisions
Soon the fence tiling mode may not always match the fb modifier
even for X tiled buffers. So let's use the fb modifier
consistently for all display tiling decisions.

v2: Rebased due to s/ring/engine/
v3: Rebased due to s/engine/ring/ O_o
v4: Rebase due to i915_gem_object_get_tiling() & co.

Reviewed-by: Matthew Auld <matthew.auld@intel.com> (v1)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470821001-25272-6-git-send-email-ville.syrjala@linux.intel.com
2016-08-11 18:33:43 +03:00
Ville Syrjälä
2949056c86 drm/i915: Pass around plane_state instead of fb+rotation
intel_compute_tile_offset() and intel_add_fb_offsets() get passed the fb
and the rotation. As both of those come from the plane state we can just
pass that in instead.

For extra consitency pass the plane state to intel_fb_xy_to_linear() as
well even though it only really needs the fb.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470821001-25272-5-git-send-email-ville.syrjala@linux.intel.com
2016-08-11 18:33:32 +03:00