Commit Graph

651 Commits

Author SHA1 Message Date
Chris Wilson
7c3f86b6dc drm/i915: Invalidate the guc ggtt TLB upon insertion
Move the GuC invalidation of its ggtt TLB to where we perform the ggtt
modification rather than proliferate it into all the callers of the
insert (which may or may not in fact have to do the insertion).

v2: Just do the guc invalidate unconditionally, (afaict) it has no impact
without the guc loaded on gen8+
v3: Conditionally invalidate the guc - just in case that register has
not been validated for other modes.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170112110050.25333-1-chris@chris-wilson.co.uk
2017-01-12 16:47:04 +00:00
Chris Wilson
606fec956c drm/i915: Prefer random replacement before eviction search
Performing an eviction search can be very, very slow especially for a
range restricted replacement. For example, a workload like
gem_concurrent_blit will populate the entire GTT and then cause aperture
thrashing. Since the GTT is a mix of active and inactive tiny objects,
we have to search through almost 400k objects before finding anything
inside the mappable region, and as this search is required before every
operation performance falls off a cliff.

Instead of performing the full search, we do a trial replacement of the
node at a random location fitting the specified restrictions. We lose
the strict LRU property of the GTT in exchange for avoiding the slow
search (several orders of runtime improvement for gem_concurrent_blit
4KiB-global-gtt, e.g. from 5000s to 20s). The loss of LRU replacement is
(later) mitigated firstly by only doing replacement if we find no
freespace and secondly by execbuf doing a PIN_NONBLOCK search first before
it starts thrashing (i.e. the random replacement will only occur from the
already inactive set of objects).

v2: Ascii-art, and check preconditionst
v3: Rephrase final sentence in comment to explain why we don't bother
with if (i915_is_ggtt(vm)) for preferring random replacement.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170111112312.31493-3-chris@chris-wilson.co.uk
2017-01-11 12:28:37 +00:00
Chris Wilson
625d988acc drm/i915: Extract reserving space in the GTT to a helper
Extract drm_mm_reserve_node + calling i915_gem_evict_for_node into its
own routine so that it can be shared rather than duplicated.

v2: Kerneldoc

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: igvt-g-dev@lists.01.org
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170111112312.31493-2-chris@chris-wilson.co.uk
2017-01-11 12:28:13 +00:00
Chris Wilson
e007b19d7b drm/i915: Use the MRU stack search after evicting
When we evict from the GTT to make room for an object, the hole we
create is put onto the MRU stack inside the drm_mm range manager. On the
next search pass, we can speed up a PIN_HIGH allocation by referencing
that stack for the new hole.

v2: Pull together the 3 identical implements (ahem, a couple were
outdated) into a common routine for allocating a node and evicting as
necessary.
v3: Detect invalid calls to i915_gem_gtt_insert()
v4: kerneldoc

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170111112312.31493-1-chris@chris-wilson.co.uk
2017-01-11 12:25:15 +00:00
Chris Wilson
f51455d442 drm/i915: Replace 4096 with PAGE_SIZE or I915_GTT_PAGE_SIZE
Start converting over from the byte count to its semantic macro, either
we want to allocate the size of a physical page in main memory or we
want the size of a virtual page in the GTT. 4096 could mean either, but
PAGE_SIZE and I915_GTT_PAGE_SIZE are explicit and should help improve
code comprehension and future changes. In the future, we may want to use
variable GTT page sizes and so have the challenge of knowing which
hardcoded values were used to represent a physical page vs the virtual
page.

v2: Look for a few more 4096s to convert, discover IS_ALIGNED().
v3: 4096ul paranoia, make fence alignment a distinct value of 4096, keep
bdw stolen w/a as 4096 until we know better.
v4: Add asserts that i915_vma_insert() start/end are aligned to GTT page
sizes.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170110144734.26052-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-01-10 20:54:32 +00:00
Matthew Auld
9e65a37872 drm/i915: don't open code the pdpe/pml4e clearing
Now that it's obvious what the helpers do, we can simplify the code
somewhat by using them when clearing the pdpe/pml4e with the relevant
scratch entry.

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161213160512.7008-1-matthew.auld@intel.com
2017-01-09 16:43:39 +02:00
Matthew Auld
5684310760 drm/i915: s/gen8_setup_page_directory_pointer/gen8_setup_pml4e/
The function name gen8_setup_page_directory_pointer is misleading, and
only serves to confuse the reader, it's not setting up a pdp, but
rather encoding a specific pml4e with a given pdp.

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-01-09 16:43:38 +02:00
Matthew Auld
5c693b2b8a drm/i915: s/gen8_setup_page_directory/gen8_setup_pdpe/
The function name gen8_setup_page_directory is misleading, and only
serves to confuse the reader, it's not setting up a pd, but rather
encoding a specific pdpe with a given pd.

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2017-01-09 16:43:38 +02:00
Chris Wilson
1a292fa53d drm/i915: Purge loose pages if we run out of DMA remap space
If the DMA remap fails, one cause can be that we have too many objects
pinned in a small remapping table, such as swiotlb. (DMA remapping does
not trigger the shrinker by itself on its normal failure paths.)  So try
purging all other objects (using i915_gem_shrink_all(), sparing our own
pages as we have yet to assign them to the obj->pages) and try again. If
there are no pages to reclaim (and consequently no pages to unmap), the
shrinker will report 0 and we fail with -ENOSPC as before.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170106152240.5793-2-chris@chris-wilson.co.uk
2017-01-06 16:54:17 +00:00
Chris Wilson
edd1f2fe11 drm/i915: Use fixed-sized types for stolen
Stolen memory is a hardware resource of known size, so use an accurate
fixed integer type rather than the ambiguous variable size_t. This was
motivated by the next patch spotting inconsistencies in our types.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170106152013.24684-3-chris@chris-wilson.co.uk
2017-01-06 16:02:09 +00:00
Chris Wilson
db9309a526 drm/i915/guc: Exclude the upper end of the Global GTT for the GuC
The GuC uses a special mapping for the upper end of the Global GTT,
similar to the way it uses a special mapping for the lower end, so
exclude it from our drm_mm to prevent us using it.

v2: Rename to reflect that it is unmappable similar to the region at the
bottom of the GGTT, and couple it into the assertion that we don't feed
unmappable addresses to the GuC.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170105153023.30575-5-chris@chris-wilson.co.uk
2017-01-05 15:34:46 +00:00
Daniel Vetter
ef426c1038 Merge tag 'drm-misc-next-2016-12-30' of git://anongit.freedesktop.org/git/drm-misc into drm-intel-next-queued
Directly merge drm-misc into drm-intel since Dave is on vacation and
we need the various drm-misc patches (fb format rework, drm mm fixes,
selftest framework and others). Also pulled back -rc2 in first to
resync with drm-intel-fixes and make sure I can reuse the exact rerere
solutions from drm-tip for safety, and because I'm lazy.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2017-01-04 11:41:10 +01:00
Matthew Auld
7a0499a4b8 drm/i915: move vma sanity checking into i915_vma_bind
If we move the sanity checking from gen8_alloc_va_range_3lvl and
gen6_alloc_va_range into i915_vma_bind, we will increase our coverage to
now both callbacks. We also convert each WARN_ON over to a GEM_WARN_ON.

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161213203222.32564-2-matthew.auld@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-12-16 21:17:22 +00:00
Chris Wilson
b44f97fd78 drm/i915: Simplify i915_gtt_color_adjust()
If we remember that node_list is a circular list containing the fake
head_node, we can use a simple list_next_entry() and skip the NULL check
for the allocated check against the head_node.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20161216074718.32500-3-chris@chris-wilson.co.uk
2016-12-16 16:36:41 +01:00
Chris Wilson
45b186f111 drm: Constify the drm_mm API
Mark up the pointers as constant through the API where appropriate.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20161216074718.32500-5-chris@chris-wilson.co.uk
2016-12-16 14:38:49 +01:00
Michel Thierry
9e1d0e604e drm/i915: Advertise ppgtt support type in platform definition
Instead of being hidden in sanitize_enable_ppgtt.
It also seems to be the place to do so nowadays.

Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-12-07 07:09:55 +00:00
Chris Wilson
85fd4f58d7 drm/i915: Mark all non-vma being inserted into the address spaces
We need to distinguish between full i915_vma structs and simple
drm_mm_nodes when considering eviction (i.e. we must be careful not to
treat a mere drm_mm_node as a much larger i915_vma causing memory
corruption, if we are lucky). To do this, color these not-a-vma with -1
(I915_COLOR_UNEVICTABLE).

v2...v200: New name for -1.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161205142941.21965-1-chris@chris-wilson.co.uk
2016-12-05 20:49:17 +00:00
Ander Conselvan de Oliveira
cc3f90f063 drm/i915/glk: Reuse broxton code for geminilake
Geminilake is mostly backwards compatible with broxton, so change most
of the IS_BROXTON() checks to IS_GEN9_LP(). Differences between the
platforms will be implemented in follow-up patches.

v2: Don't reuse broxton's path in intel_update_max_cdclk().
    Don't set plane count as in broxton.

v3: Rebase

v4: Include the check intel_bios_is_port_hpd_inverted().
    Commit message.

v5: Leave i915_dmc_info() out; glk's csr version != bxt's. (Rodrigo)

v6: Rebase.

v7: Convert a few mode IS_BROXTON() occurances in pps, ddi, dsi and pll
    code. (Rodrigo)

v8: Squash a couple of DDI patches with more conversions. (Rodrigo)

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1480667037-11215-2-git-send-email-ander.conselvan.de.oliveira@intel.com
2016-12-02 16:38:56 +02:00
Chris Wilson
c6385c947f drm/i915: Fix tracepoint compilation
drivers/gpu/drm/i915/./i915_trace.h: In function ‘trace_event_raw_event_i915_gem_evict’:
drivers/gpu/drm/i915/./i915_trace.h:409:24: error: ‘struct i915_address_space’ has no member named ‘dev’
       __entry->dev = vm->dev->primary->index;

A couple of macros missed in the s/vm->dev/vm->i915/ conversion.

Fixes: 49d73912cb ("drm/i915: Convert vm->dev backpointer to vm->i915")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161129124205.19351-1-chris@chris-wilson.co.uk
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
2016-11-29 12:54:03 +00:00
Chris Wilson
49d73912cb drm/i915: Convert vm->dev backpointer to vm->i915
99% of the time we access i915_address_space->dev we want the i915
device and not the drm device, so let's store the drm_i915_private
backpointer instead. The only real complication here are the inlines
in i915_vma.h where drm_i915_private is not yet defined and so we have
to choose an alternate path for our asserts.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161129095008.32622-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-11-29 11:38:00 +00:00
Zhi Wang
a18dbba8f0 drm/i915: Move the release of PT page to the upper caller
a PT page will be released if it doesn't contain any meaningful mappings
during PPGTT page table shrinking. The PT entry in the upper level will
be set to a scratch entry.

Normally this works nicely, but in virtualization world, the PPGTT page
table is tracked by hypervisor. Releasing the PT page before modifying
the upper level PT entry would cause extra efforts.

As the tracked page has been returned to OS before losing track from
hypervisor, it could be written in any pattern. Hypervisor has to recognize
if a page is still being used as a PT page by validating these writing
patterns. It's complicated. Better let the guest modify the PT entry in
upper level PT first, then release the PT page.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Zhiyuan Lv <zhiyuan.lv@intel.com>
Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Link: https://patchwork.freedesktop.org/patch/122697/msgid/1479728666-25333-1-git-send-email-zhi.a.wang@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1480402516-22275-1-git-send-email-zhi.a.wang@intel.com
2016-11-29 09:29:56 +00:00
Matthew Auld
ed9724ddde drm/i915: add i915_address_space_fini
We already have an i915_address_space_init, so for symmetry we should
also have a _fini, plus we already open code it twice. This then also
fixes a bug where we leak the timeline for the ggtt vm.

v2: don't forget about the struct_mutex for the ggtt path.

Fixes: 80b204bce8 ("drm/i915: Enable multiple timelines")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20161117210411.14044-1-chris@chris-wilson.co.uk
2016-11-17 21:05:36 +00:00
Tvrtko Ursulin
7ace3d3024 drm/i915: dev_priv cleanup in i915_gem_stolen.c
And a little bit of cascaded function prototype changes.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-11-17 13:56:16 +00:00
Tvrtko Ursulin
275a991c03 drm/i915: dev_priv cleanup in i915_gem_gtt.c
Started with removing INTEL_INFO(dev) and cascaded into a quite
big trickle of function prototype changes. Still, I think it is
for the better.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-11-17 13:56:12 +00:00
Tvrtko Ursulin
c6be607abc drm/i915: dev_priv and a small cascade of cleanups in i915_gem.c
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-11-17 13:55:26 +00:00
Tvrtko Ursulin
b7f05d4ae0 drm/i915: Pass dev_priv to INTEL_INFO everywhere apart from the gen use
After this patch only conversion of INTEL_INFO(p)->gen to
INTEL_GEN(dev_priv) remains before the __I915__ macro can
be removed.

v2: Tidy vlv_compute_wm. (David Weinehall)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: David Weinehall <david.weinehall@linux.intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
2016-11-11 14:58:26 +00:00
Joonas Lahtinen
b42fe9ca0a drm/i915: Split out i915_vma.c
As a side product, had to split two other files;
- i915_gem_fence_reg.h
- i915_gem_object.h (only parts that needed immediate untanglement)

I tried to move code in as big chunks as possible, to make review
easier. i915_vma_compare was moved to a header temporarily.

v2:
- Use i915_gem_fence_reg.{c,h}

v3:
- Rebased

v4:
- Fix building when DEBUG_GEM is enabled by reordering a bit.

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1478861034-30643-1-git-send-email-joonas.lahtinen@linux.intel.com
2016-11-11 14:34:54 +02:00
Chris Wilson
dfd2812e6d drm/i915: Remove the vma from the object list upon close
Currently, the vma is being unlink from the object lookup on destroy.
However, we are meant to be decoupling it upon close so that the user
cannot access the closed vma whilst it remains active on the GPU.

[   34.074858] kernel BUG at drivers/gpu/drm/i915/i915_gem_gtt.c:3561!
[   34.074875] invalid opcode: 0000 [#1] PREEMPT SMP
[   34.074888] Modules linked in: snd_hda_intel i915 x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel lpc_ich mei_me mei snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_codec snd_hwdep snd_hda_core i2c_designware_platform i2c_designware_core snd_pcm e1000e ptp pps_core sdhci_acpi sdhci mmc_core i2c_hid [last unloaded: i915]
[   34.075010] CPU: 1 PID: 6224 Comm: gem_close_race Tainted: G     U          4.9.0-rc3-CI-CI_DRM_1800+ #1
[   34.075034] Hardware name:                  /NUC5i7RYB, BIOS RYBDWi35.86A.0355.2016.0224.1501 02/24/2016
[   34.075057] task: ffff8802459a8040 task.stack: ffffc90000524000
[   34.075074] RIP: 0010:[<ffffffffa0392cbc>]  [<ffffffffa0392cbc>] i915_gem_obj_lookup_or_create_vma+0x8c/0xc0 [i915]
[   34.075118] RSP: 0018:ffffc90000527b68  EFLAGS: 00010202
[   34.075135] RAX: ffff8802426c5e40 RBX: 0000000000000000 RCX: ffff8802447fc2a8
[   34.075158] RDX: 0000000000000000 RSI: ffff8802447fc2a8 RDI: ffff880248a4a880
[   34.075181] RBP: ffffc90000527b88 R08: 0000000000000008 R09: 0000000000000000
[   34.075203] R10: 0000000000000001 R11: 0000000000000000 R12: ffff880248a4a880
[   34.075225] R13: ffff8802447fc2a8 R14: ffff880243e9afa8 R15: ffff880248a4a9c8
[   34.075248] FS:  00007f9b43e59740(0000) GS:ffff880256c80000(0000) knlGS:0000000000000000
[   34.075273] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   34.075292] CR2: 00007f9b43419140 CR3: 000000024455d000 CR4: 00000000003406e0
[   34.075314] Stack:
[   34.075323]  0000000000000000 ffffc90000527bd0 ffff880243cb8008 ffff880243e9afa8
[   34.075353]  ffffc90000527c08 ffffffffa03874c7 ffffc90000527bb8 ffff880243e9afa8
[   34.075383]  ffff880243e9afb0 ffffc90000527e10 ffff8802447fc2a8 ffff880243cb8040
[   34.075414] Call Trace:
[   34.075435]  [<ffffffffa03874c7>] eb_lookup_vmas.isra.7+0x247/0x330 [i915]
[   34.075468]  [<ffffffffa0388c34>] i915_gem_do_execbuffer.isra.15+0x604/0x1a10 [i915]
[   34.075507]  [<ffffffffa039c957>] ? i915_gem_object_get_sg+0x347/0x380 [i915]
[   34.075532]  [<ffffffff811a69ce>] ? __might_fault+0x3e/0x90
[   34.075562]  [<ffffffffa038a430>] i915_gem_execbuffer2+0xc0/0x250 [i915]
[   34.075585]  [<ffffffff81552926>] drm_ioctl+0x1f6/0x480
[   34.075604]  [<ffffffff8100107a>] ? trace_hardirqs_on_thunk+0x1a/0x1c
[   34.075635]  [<ffffffffa038a370>] ? i915_gem_execbuffer+0x330/0x330 [i915]
[   34.075658]  [<ffffffff81202d2e>] do_vfs_ioctl+0x8e/0x690
[   34.075677]  [<ffffffff8181582d>] ? _raw_spin_unlock_irqrestore+0x3d/0x60
[   34.075700]  [<ffffffff810fcd51>] ? SyS_timer_settime+0x141/0x1e0
[   34.075721]  [<ffffffff810d6de2>] ? trace_hardirqs_on_caller+0x122/0x1b0
[   34.075742]  [<ffffffff8120336c>] SyS_ioctl+0x3c/0x70
[   34.075760]  [<ffffffff8181602e>] entry_SYSCALL_64_fastpath+0x1c/0xb1
[   34.075781] Code: 44 a0 48 c7 c2 9a 7e 43 a0 be e0 0d 00 00 48 c7 c7 a0 45 44 a0 e8 55 b8 ce e0 48 85 db 74 a3 49 83 bd f8 03 00 00 00 74 99 0f 0b <0f> 0b 48 89 da 4c 89 ee 4c 89 e7 e8 04 a9 ff ff 48 89 da 49 89
[   34.075955] RIP  [<ffffffffa0392cbc>] i915_gem_obj_lookup_or_create_vma+0x8c/0xc0 [i915]
[   34.075994]  RSP <ffffc90000527b68>

Testcase: igt/gem_close_race/basic-threads
Fixes: db6c2b4151 ("drm/i915: Store the vma in an rbtree...")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161104161241.25871-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-11-07 11:32:24 +00:00
Chris Wilson
2c3a3f44dc drm/i915: Fix pages pin counting around swizzle quirk
commit bc0629a767 ("drm/i915: Track pages pinned due to swizzling
quirk") fixed one problem, but revealed a whole lot more. The root cause
of the pin count mismatch for the swizzle quirk (for L-shaped memory on
gen3/4) was that we were incrementing the pages_pin_count upon getting
the backing pages but then overwriting the pages_pin_count to set it to
1 afterwards. With a little bit of adjustment to satisfy the GEM_BUG_ON
sanitychecks, the fix is to replace the explicit atomic_set with an
atomic_inc.

v2: Consistently use atomics (not mix atomics and helpers) within the
lowlevel get_pages routines. This makes the atomic operations much
clearer.

Fixes: 1233e2db19 ("drm/i915: Move object backing storage manipulation")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161104103001.27643-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-11-04 11:55:39 +00:00
Chris Wilson
a44342acde drm/i915: Fix test on inputs for vma_compare()
When supplying a view to vma_compare() it is required that the supplied
i915_address_space is the global GTT. I tested the VMA instead (which is
the current position in the rbtree and maybe from any address space).

Reported-by: Matthew Auld <matthew.auld@intel.com>
Tested-by: Matthew Auld <matthew.auld@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98579
Fixes: db6c2b4151 ("drm/i915: Store the vma in an rbtree...")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161103200852.23431-1-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
2016-11-03 21:00:43 +00:00
Joonas Lahtinen
56cea32382 drm/i915: Unify global_list into global_link
$ sed -i -r 's/\bglobal_list\b/global_link/g' *.c *.h

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1478081764-8058-1-git-send-email-joonas.lahtinen@linux.intel.com
2016-11-02 15:17:13 +02:00
Mika Kuoppala
fce937559e drm/i915/gtt: Mark tlbs dirty on clear
Now when clearing ptes can modify upper level pdp's,
we need to mark them dirty so that they will be flushed
correctly.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1478006856-8313-1-git-send-email-mika.kuoppala@intel.com
2016-11-02 11:58:32 +02:00
Mika Kuoppala
37c6393431 drm/i915/gtt: Fix pte clear range
Comparing pte index to a number of entries is wrong
when clearing a range of pte entries. Use end marker
of 'one past' to correctly point adequate number of
ptes to the scratch page.

v2: assert early instead of warning late (Chris)
v3: removed consts (Joonas)

Fixes: d209b9c3cd ("drm/i915/gtt: Split gen8_ppgtt_clear_pte_range")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98282
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reported-by: Mike Lothian <mike@fireburn.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
2016-11-02 11:58:03 +02:00
Chris Wilson
db6c2b4151 drm/i915: Store the vma in an rbtree under the object
With full-ppgtt one of the main bottlenecks is the lookup of the VMA
underneath the object. For execbuf there is merit in having a very fast
direct lookup of ctx:handle to the vma using a hashtree, but that still
leaves a large number of other lookups. One way to speed up the lookup
would be to use a rhashtable, but that requires extra allocations and
may exhibit poor worse case behaviour. An alternative is to use an
embedded rbtree, i.e. no extra allocations and deterministic behaviour,
but at the slight cost of O(lgN) lookups (instead of O(1) for
rhashtable). The major of such tree will be very shallow and so not much
slower, and still scales much, much better than the current unsorted
list.

v2: Bump vma_compare() to return a long, as we return the result of
comparing two pointers.

References: https://bugs.freedesktop.org/show_bug.cgi?id=87726
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161101115400.15647-1-chris@chris-wilson.co.uk
2016-11-01 13:00:40 +00:00
Chris Wilson
80b204bce8 drm/i915: Enable multiple timelines
With the infrastructure converted over to tracking multiple timelines in
the GEM API whilst preserving the efficiency of using a single execution
timeline internally, we can now assign a separate timeline to every
context with full-ppgtt.

v2: Add a comment to indicate the xfer between timelines upon submission.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-35-chris@chris-wilson.co.uk
2016-10-28 20:53:57 +01:00
Chris Wilson
d07f0e59b2 drm/i915: Move GEM activity tracking into a common struct reservation_object
In preparation to support many distinct timelines, we need to expand the
activity tracking on the GEM object to handle more than just a request
per engine. We already use the struct reservation_object on the dma-buf
to handle many fence contexts, so integrating that into the GEM object
itself is the preferred solution. (For example, we can now share the same
reservation_object between every consumer/producer using this buffer and
skip the manual import/export via dma-buf.)

v2: Reimplement busy-ioctl (by walking the reservation object), postpone
the ABI change for another day. Similarly use the reservation object to
find the last_write request (if active and from i915) for choosing
display CS flips.

Caveats:

 * busy-ioctl: busy-ioctl only reports on the native fences, it will not
warn of stalls (in set-domain-ioctl, pread/pwrite etc) if the object is
being rendered to by external fences. It also will not report the same
busy state as wait-ioctl (or polling on the dma-buf) in the same
circumstances. On the plus side, it does retain reporting of which
*i915* engines are engaged with this object.

 * non-blocking atomic modesets take a step backwards as the wait for
render completion blocks the ioctl. This is fixed in a subsequent
patch to use a fence instead for awaiting on the rendering, see
"drm/i915: Restore nonblocking awaits for modesetting"

 * dynamic array manipulation for shared-fences in reservation is slower
than the previous lockless static assignment (e.g. gem_exec_lut_handle
runtime on ivb goes from 42s to 66s), mainly due to atomic operations
(maintaining the fence refcounts).

 * loss of object-level retirement callbacks, emulated by VMA retirement
tracking.

 * minor loss of object-level last activity information from debugfs,
could be replaced with per-vma information if desired

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-21-chris@chris-wilson.co.uk
2016-10-28 20:53:50 +01:00
Chris Wilson
03ac84f183 drm/i915: Pass around sg_table to get_pages/put_pages backend
The plan is to move obj->pages out from under the struct_mutex into its
own per-object lock. We need to prune any assumption of the struct_mutex
from the get_pages/put_pages backends, and to make it easier we pass
around the sg_table to operate on rather than indirectly via the obj.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-13-chris@chris-wilson.co.uk
2016-10-28 20:53:47 +01:00
Chris Wilson
a4f5ea64f0 drm/i915: Refactor object page API
The plan is to make obtaining the backing storage for the object avoid
struct_mutex (i.e. use its own locking). The first step is to update the
API so that normal users only call pin/unpin whilst working on the
backing storage.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-12-chris@chris-wilson.co.uk
2016-10-28 20:53:46 +01:00
Chris Wilson
d2a84a76a3 drm/i915: Use radixtree to jump start intel_partial_pages()
We can use the radixtree index of the obj->pages to find the start
position of the desired partial range.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-11-chris@chris-wilson.co.uk
2016-10-28 20:53:46 +01:00
Chris Wilson
4c7d62c6b8 drm/i915: Markup GEM API with lockdep asserts
Add lockdep_assert_held(struct_mutex) to the API preamble of the
internal GEM interfaces.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-9-chris@chris-wilson.co.uk
2016-10-28 20:53:45 +01:00
Chris Wilson
f8a7fde456 drm/i915: Defer active reference until required
We only need the active reference to keep the object alive after the
handle has been deleted (so as to prevent a synchronous gem_close). Why
then pay the price of a kref on every execbuf when we can insert that
final active ref just in time for the handle deletion?

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161028125858.23563-6-chris@chris-wilson.co.uk
2016-10-28 20:53:43 +01:00
Chris Wilson
2eedfc7d58 drm/i915: Remove RPM sequence checking
We only used the RPM sequence checking inside the lowlevel GTT
accessors, when we had to rely on callers taking the wakeref on our
behalf. Now that we take the RPM wakeref inside the GTT management
routines themselves, we can forgo the sanitycheck of the callers.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161024124218.18252-4-chris@chris-wilson.co.uk
2016-10-24 13:45:37 +01:00
Chris Wilson
9c870d0367 drm/i915: Use RPM as the barrier for controlling user mmap access
We can remove the false coupling between RPM and struct mutex by the
observation that we can use the RPM wakeref as the barrier around user
mmap access. That is as we tear down the user's PTE atomically from
within rpm suspend and then to fault in new PTE requires the rpm
wakeref, means that no user access is possible through those PTE without
RPM being awake. Having made that observation, we can then remove the
presumption of having to take rpm outside of struct_mutex and so allow
fine grained acquisition of a wakeref around hw access rather than
having to remember to acquire the wakeref early on.

v2: Rejig placement of the new intel_runtime_pm_get() to be as tight
as possible around the GTT pread/pwrite.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20161024124218.18252-2-chris@chris-wilson.co.uk
2016-10-24 13:45:35 +01:00
Michał Winiarski
2ce5179fe8 drm/i915/gtt: Free unused lower-level page tables
Since "Dynamic page table allocations" were introduced, our page tables
can grow (being dynamically allocated) with address space range usage.
Unfortunately, their lifetime is bound to vm. This is not a huge problem
when we're not using softpin - drm_mm is creating an upper bound on used
range by causing addresses for our VMAs to eventually be reused.

With softpin, long lived contexts can drain the system out of memory
even with a single "small" object. For example:

bo = bo_alloc(size);
while(true)
    offset += size;
    exec(bo, offset);

Will cause us to create new allocations until all memory in the system
is used for tracking GPU pages (even though almost all PTEs in this vm
are pointing to scratch).

Let's free unused page tables in clear_range to prevent this - if no
entries are used, we can safely free it and return this information to
the caller (so that higher-level entry is pointing to scratch).

v2: Document return value and free semantics (Joonas)
v3: No newlines in vars block (Joonas)
v4: Drop redundant local 'reduce' variable
v5: Handle CI fail with enable_ppgtt=2

Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1476360162-24062-3-git-send-email-michal.winiarski@intel.com
2016-10-14 12:42:23 +01:00
Michał Winiarski
d209b9c3cd drm/i915/gtt: Split gen8_ppgtt_clear_pte_range
Let's use more top-down approach, where each gen8_ppgtt_clear_* function
is responsible for clearing the struct passed as an argument and calling
relevant clear_range functions on lower-level tables.
Doing this rather than operating on PTE ranges makes the implementation
of shrinking page tables quite simple.

v2: Drop min when calculating num_entries, no negation in 48b ppgtt
check, no newlines in vars block (Joonas)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1476360162-24062-2-git-send-email-michal.winiarski@intel.com
2016-10-14 12:40:33 +01:00
Michał Winiarski
4fb84d991e drm/i915: Remove unused "valid" parameter from pte_encode
We never used any invalid ptes, those were put in place for
a possibility of doing gpu faults. However our batchbuffers are not
restricted in length, so everything needs to be pointing to something
and thus out-of-bounds is pointing to scratch.

Remove the valid flag as it is always true.

v2: Expand commit msg, patch reorder (Mika)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1476360162-24062-1-git-send-email-michal.winiarski@intel.com
2016-10-14 12:40:32 +01:00
Tvrtko Ursulin
5db9401983 drm/i915: Make IS_GEN macros only take dev_priv
Saves 1416 bytes of .rodata strings.

v2: Add parantheses around dev_priv. (Ville Syrjala)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Jani Nikula <jani.nikula@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1476352990-2504-1-git-send-email-tvrtko.ursulin@linux.intel.com
2016-10-14 12:23:22 +01:00
Tvrtko Ursulin
920a14b245 drm/i915: Make IS_CHERRYVIEW only take dev_priv
Saves 864 bytes of .rodata strings and ~100 of .text.

v2: Add parantheses around dev_priv. (Ville Syrjala)
v3: Rebase.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Jani Nikula <jani.nikula@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2016-10-14 12:23:19 +01:00
Tvrtko Ursulin
e2d214ae2b drm/i915: Make IS_BROXTON only take dev_priv
Saves 1392 bytes of .rodata strings.

Also change a few function/macro prototypes in i915_gem_gtt.c
from dev to dev_priv where it made more sense to do so.

v2: Add parantheses around dev_priv. (Ville Syrjala)
v3: Mention function prototype changes. (David Weinehall)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: David Weinehall <david.weinehall@linux.intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Jani Nikula <jani.nikula@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
2016-10-14 12:23:19 +01:00
Tvrtko Ursulin
d9486e6501 drm/i915: Make IS_SKYLAKE only take dev_priv
Saves 1016 bytes of .rodata strings and couple hundred of .text.

v2: Add parantheses around dev_priv. (Ville Syrjala)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Jani Nikula <jani.nikula@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2016-10-14 12:23:19 +01:00
Tvrtko Ursulin
772c2a519c drm/i915: Make IS_HASWELL only take dev_priv
Saves 2432 bytes of .rodata strings.

v2: Add parantheses around dev_priv. (Ville Syrjala)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Jani Nikula <jani.nikula@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2016-10-14 12:23:19 +01:00
Tvrtko Ursulin
8652744b64 drm/i915: Make IS_BROADWELL only take dev_priv
Saves 1808 bytes of .rodata strings.

v2: Add parantheses around dev_priv. (Ville Syrjala)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Jani Nikula <jani.nikula@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2016-10-14 12:23:19 +01:00
Akash Goel
3b3f1650b1 drm/i915: Allocate intel_engine_cs structure only for the enabled engines
With the possibility of addition of many more number of rings in future,
the drm_i915_private structure could bloat as an array, of type
intel_engine_cs, is embedded inside it.
	struct intel_engine_cs engine[I915_NUM_ENGINES];
Though this is still fine as generally there is only a single instance of
drm_i915_private structure used, but not all of the possible rings would be
enabled or active on most of the platforms. Some memory can be saved by
allocating intel_engine_cs structure only for the enabled/active engines.
Currently the engine/ring ID is kept static and dev_priv->engine[] is simply
indexed using the enums defined in intel_engine_id.
To save memory and continue using the static engine/ring IDs, 'engine' is
defined as an array of pointers.
	struct intel_engine_cs *engine[I915_NUM_ENGINES];
dev_priv->engine[engine_ID] will be NULL for disabled engine instances.

There is a text size reduction of 928 bytes, from 1028200 to 1027272, for
i915.o file (but for i915.ko file text size remain same as 1193131 bytes).

v2:
- Remove the engine iterator field added in drm_i915_private structure,
  instead pass a local iterator variable to the for_each_engine**
  macros. (Chris)
- Do away with intel_engine_initialized() and instead directly use the
  NULL pointer check on engine pointer. (Chris)

v3:
- Remove for_each_engine_id() macro, as the updated macro for_each_engine()
  can be used in place of it. (Chris)
- Protect the access to Render engine Fault register with a NULL check, as
  engine specific init is done later in Driver load sequence.

v4:
- Use !!dev_priv->engine[VCS] style for the engine check in getparam. (Chris)
- Kill the superfluous init_engine_lists().

v5:
- Cleanup the intel_engines_init() & intel_engines_setup(), with respect to
  allocation of intel_engine_cs structure. (Chris)

v6:
- Rebase.

v7:
- Optimize the for_each_engine_masked() macro. (Chris)
- Change the type of 'iter' local variable to enum intel_engine_id. (Chris)
- Rebase.

v8: Rebase.

v9: Rebase.

v10:
- For index calculation use engine ID instead of pointer based arithmetic in
  intel_engine_sync_index() as engine pointers are not contiguous now (Chris)
- For appropriateness, rename local enum variable 'iter' to 'id'. (Joonas)
- Use for_each_engine macro for cleanup in intel_engines_init() and remove
  check for NULL engine pointer in cleanup() routines. (Joonas)

v11: Rebase.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1476378888-7372-1-git-send-email-akash.goel@intel.com
2016-10-14 09:58:43 +01:00
Chris Wilson
95374d759a drm/i915: Always use the GTT for error capture
Since the GTT provides universal access to any GPU page, we can use it
to reduce our plethora of read methods to just one. It also has the
important characteristic of being exactly what the GPU sees - if there
are incoherency problems, seeing the batch as executed (rather than as
trapped inside the cpu cache) is important.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161012090522.367-4-chris@chris-wilson.co.uk
2016-10-12 12:00:32 +01:00
Matthew Auld
82daabae9e drm/i915: remove writeq ifdeffery
drm already provides fallback versions of readq and writeq.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1473451373-9852-1-git-send-email-matthew.auld@intel.com
2016-09-12 11:33:56 +03:00
Chris Wilson
fbb30a5c46 drm/i915: Flush to GTT domain all GGTT bound objects after hibernation
Recently I have been applying an optimisation to avoid stalling and
clflushing GGTT objects based on their current binding. That is we only
set-to-gtt-domain upon first bind. However, on hibernation the objects
remain bound, but they are in the CPU domain. Currently (since commit
975f7ff42e ("drm/i915: Lazily migrate the objects after hibernation"))
we only flush scanout objects as all other objects are expected to be
flushed prior to use. That breaks down in the face of the runtime
optimisation above - and we need to flush all GGTT pinned objects
(essentially ringbuffers).

To reduce the burden of extra clflushes, we only flush those objects we
cannot discard from the GGTT. Everything pinned to the scanout, or
current contexts or ringbuffers will be flushed and rebound. Other
objects, such as inactive contexts, will be left unbound and in the CPU
domain until first use after resuming.

Fixes: 7abc98fadf ("drm/i915: Only change the context object's domain...")
Fixes: 57e8853181 ("drm/i915: Use VMA for ringbuffer tracking")
References: https://bugs.freedesktop.org/show_bug.cgi?id=94722
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: David Weinehall <david.weinehall@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160909201957.2499-1-chris@chris-wilson.co.uk
2016-09-09 21:31:43 +01:00
Chris Wilson
22dd3bb919 drm/i915: Mark up all locked waiters
In the next patch we want to handle reset directly by a locked waiter in
order to avoid issues with returning before the reset is handled. To
handle the reset, we must first know whether we hold the struct_mutex.
If we do not hold the struct_mtuex we can not perform the reset, but we do
not block the reset worker either (and so we can just continue to wait for
request completion) - otherwise we must relinquish the mutex.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160909131201.16673-10-chris@chris-wilson.co.uk
2016-09-09 14:23:03 +01:00
Chris Wilson
ea746f3659 drm/i915: Expand bool interruptible to pass flags to i915_wait_request()
We need finer control over wakeup behaviour during i915_wait_request(),
so expand the current bool interruptible to a bitmask.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160909131201.16673-9-chris@chris-wilson.co.uk
2016-09-09 14:23:03 +01:00
Zhi Wang
e320d40022 drm/i915: disable 48bit full PPGTT when vGPU is active
Disable 48bit full PPGTT on vGPU too for now.

Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: drm-intel-fixes@lists.freedesktop.org
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160906040412.1274-3-zhenyuw@linux.intel.com
2016-09-06 16:40:00 +03:00
Chris Wilson
bb8f9cffad drm/i915: Allow DMA pagetables to use highmem
As we never need to directly access the pages we allocate for scratch and
the pagetables, and always remap them into the GTT through the dma
remapper, we do not need to limit the allocations to lowmem i.e. we can
pass in the __GFP_HIGHMEM flag to the page allocation.

For backwards compatibility, e.g. certain old GPUs not liking highmem
for certain functions that may be accidentally mapped to the scratch
page by userspace, keep the GMCH probe as only allocating from DMA32.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20160822074431.26872-3-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-08-22 12:19:52 +01:00
Chris Wilson
8bcdd0f756 drm/i915: Embed the scratch page struct into each VM
As the scratch page is no longer shared between all VM, and each has
their own, forgo the small allocation and simply embed the scratch page
struct into the i915_address_space.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20160822074431.26872-2-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Mika Kuoppala <mika.kuoppala@intel.com>
2016-08-22 12:19:52 +01:00
David Weinehall
52a05c302b drm/i915: pdev cleanup
In an effort to simplify things for a future push of dev_priv instead
of dev wherever possible, always take pdev via dev_priv where
feasible, eliminating the direct access from dev. Right now this
only eliminates a few cases of dev, but it also obviates that we pass
dev into a lot of functions where dev_priv would be the more obvious
choice.

v2: Fixed one more place missing in the previous patch set

Signed-off-by: David Weinehall <david.weinehall@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160822103245.24069-5-david.weinehall@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-08-22 12:19:52 +01:00
David Weinehall
c49d13ee13 drm/i915: consistent struct device naming
We currently have a mix of struct device *device, struct device *kdev,
and struct device *dev (the latter forcing us to refer to
struct drm_device as something else than the normal dev).

To simplify things, always use kdev when referring to struct device.

v2: Replace the dev_to_drm_minor() macro with the inline function
    kdev_to_drm_minor().

Signed-off-by: David Weinehall <david.weinehall@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160822103245.24069-3-david.weinehall@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-08-22 12:19:52 +01:00
Chris Wilson
14daa63b27 drm/i915: Stop marking the unaccessible scratch page as UC
Since by design, if not entirely by practice, nothing is allowed to
access the scratch page we use to background fill the VM, then we do not
need to ensure that it is coherent between the CPU and GPU.
set_pages_uc() does a stop_machine() after changing the PAT, and that
significantly impacts upon context creation throughput.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20160822074431.26872-1-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-08-22 11:01:47 +01:00
Chris Wilson
f7bbe7883c drm/i915: Embed the io-mapping struct inside drm_i915_private
As io_mapping.h now always allocates the struct, we can avoid that
allocation and extra pointer dance by embedding the struct inside
drm_i915_private

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160819155428.1670-5-chris@chris-wilson.co.uk
2016-08-19 17:13:35 +01:00
Chris Wilson
49ef5294cd drm/i915: Move fence tracking from object to vma
In order to handle tiled partial GTT mmappings, we need to associate the
fence with an individual vma.

v2: A couple of silly drops replaced spotted by Joonas

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-21-chris@chris-wilson.co.uk
2016-08-18 22:36:50 +01:00
Chris Wilson
05a20d098d drm/i915: Move map-and-fenceable tracking to the VMA
By moving map-and-fenceable tracking from the object to the VMA, we gain
fine-grained tracking and the ability to track individual fences on the VMA
(subsequent patch).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160818161718.27187-16-chris@chris-wilson.co.uk
2016-08-18 22:36:48 +01:00
Chris Wilson
058d88c433 drm/i915: Track pinned VMA
Treat the VMA as the primary struct responsible for tracking bindings
into the GPU's VM. That is we want to treat the VMA returned after we
pin an object into the VM as the cookie we hold and eventually release
when unpinning. Doing so eliminates the ambiguity in pinning the object
and then searching for the relevant pin later.

v2: Joonas' stylistic nitpicks, a fun rebase.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-27-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:01:13 +01:00
Chris Wilson
19880c4a3f drm/i915: Consolidate i915_vma_unpin_and_release()
In a few places, we repeat a call to clear a pointer to a vma whilst
unpinning and releasing a reference to its owner. Refactor those into a
common function.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-26-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:01:12 +01:00
Chris Wilson
e5cdb22b27 drm/i915: Move assertion for iomap access to i915_vma_pin_iomap
Access through the GTT requires the device to be awake. Ideally
i915_vma_pin_iomap() is short-lived and the pinning demarcates the
access through the iomap. This is not entirely true, we have a mixture
of long lived pins that exceed the wakelock (such as legacy ringbuffers)
and short lived pin that do live within the wakelock (such as execlist
ringbuffers).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-17-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:01:04 +01:00
Chris Wilson
81a8aa4a6c drm/i915: Create a VMA for an object
In many places, we wish to store the VMA in preference to the object
itself and so being able to create the persistent VMA is useful.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-9-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:00:55 +01:00
Chris Wilson
247177ddd5 drm/i915: Always set the vma->pages
Previously, we would only set the vma->pages pointer for GGTT entries.
However, if we always set it, we can use it to prettify some code that
may want to access the backing store associated with the VMA (as
assigned to the VMA).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471254551-25805-8-git-send-email-chris@chris-wilson.co.uk
2016-08-15 11:00:54 +01:00
Ville Syrjälä
6687c9062c drm/i915: Rewrite fb rotation GTT handling
Redo the fb rotation handling in order to:
- eliminate the NV12 special casing
- handle fb->offsets[] properly
- make the rotation handling easier for the plane code

To achieve these goals we reduce intel_rotation_info to only contain
(for each plane) the rotated view width,height,stride in tile units,
and the page offset into the object where the plane starts. Each plane
is handled exactly the same way, no special casing for NV12 or other
formats. We then store the computed rotation_info under
intel_framebuffer so that we don't have to recompute it again.

To handle fb->offsets[] we treat them as a linear offsets and convert
them to x/y offsets from the start of the relevant GTT mapping (either
normal or rotated). We store the x/y offsets under intel_framebuffer,
and for some extra convenience we also store the rotated pitch (ie.
tile aligned plane height). So for each plane we have the normal
x/y offsets, rotated x/y offsets, and the rotated pitch. The normal
pitch is available already in fb->pitches[].

While we're gathering up all that extra information, we can also easily
compute the storage requirements for the framebuffer, so that we can
check that the object is big enough to hold it.

When it comes time to deal with the plane source coordinates, we first
rotate the clipped src coordinates to match the relevant GTT view
orientation, then add to them the fb x/y offsets. Next we compute
the aligned surface page offset, and as a result we're left with some
residual x/y offsets. Finally, if required by the hardware, we convert
the remaining x/y offsets into a linear offset.

For gen2/3 we simply skip computing the final page offset, and just
convert the src+fb x/y offsets directly into a linear offset since
that's what the hardware wants.

After this all platforms, incluing SKL+, compute these things in exactly
the same way (excluding alignemnt differences).

v2: Use BIT(DRM_ROTATE_270) instead of ROTATE_270 when rotating
    plane src coordinates
    Drop some spurious changes that got left behind during
    development
v3: Split out more changes to prep patches (Daniel)
    s/intel_fb->plane[].foo.bar/intel_fb->foo[].bar/ for brevity
    Rename intel_surf_gtt_offset to intel_fb_gtt_offset
    Kill the pointless 'plane' parameter from intel_fb_gtt_offset()
v4: Fix alignment vs. alignment-1 when calling
    _intel_compute_tile_offset() from intel_fill_fb_info()
    Pass the pitch in tiles in
    stad of pixels to intel_adjust_tile_offset() from intel_fill_fb_info()
    Pass the full width/height of the rotated area to
    drm_rect_rotate() for clarity
    Use u32 for more offsets
v5: Preserve the upper_32_bits()/lower_32_bits() handling for the
    fb ggtt offset (Sivakumar)
v6: Rebase due to drm_plane_state src/dst rects

Cc: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470821001-25272-2-git-send-email-ville.syrjala@linux.intel.com
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-08-11 18:32:46 +03:00
Matthew Auld
cb7f27601c drm/i915: fix aliasing_ppgtt leak
In i915_ggtt_cleanup_hw we need to remember to free aliasing_ppgtt. This
fixes the following kmemleak message:

unreferenced object 0xffff880213cca000 (size 8192):
  comm "modprobe", pid 1298, jiffies 4294745402 (age 703.930s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff817c808e>] kmemleak_alloc+0x4e/0xb0
    [<ffffffff8121f9c2>] kmem_cache_alloc_trace+0x142/0x1d0
    [<ffffffffa06d11ef>] i915_gem_init_ggtt+0x10f/0x210 [i915]
    [<ffffffffa06d71bb>] i915_gem_init+0x5b/0xd0 [i915]
    [<ffffffffa069749a>] i915_driver_load+0x97a/0x1460 [i915]
    [<ffffffffa06a26ef>] i915_pci_probe+0x4f/0x70 [i915]
    [<ffffffff81423015>] local_pci_probe+0x45/0xa0
    [<ffffffff81424463>] pci_device_probe+0x103/0x150
    [<ffffffff81515e6c>] driver_probe_device+0x22c/0x440
    [<ffffffff81516151>] __driver_attach+0xd1/0xf0
    [<ffffffff8151379c>] bus_for_each_dev+0x6c/0xc0
    [<ffffffff8151555e>] driver_attach+0x1e/0x20
    [<ffffffff81514fa3>] bus_add_driver+0x1c3/0x280
    [<ffffffff81516aa0>] driver_register+0x60/0xe0
    [<ffffffff8142297c>] __pci_register_driver+0x4c/0x50
    [<ffffffffa013605b>] 0xffffffffa013605b

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Fixes: b18b6bde30 ("drm/i915/bdw: Free PPGTT struct")
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1470420280-21417-1-git-send-email-matthew.auld@intel.com
2016-08-05 22:39:32 +02:00
Chris Wilson
307dc25bf6 drm/i915: Simplify do_idling() (Ironlake vt-d w/a)
Now that we pass along the expected interruptible nature for the
wait-for-idle, we do not need to modify the global
i915->mm.interruptible for this single call. (Only the immediate call to
i915_gem_wait_for_idle() takes the interruptible status as the other
action, dma_map_sg(), is independent of i915.ko)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-7-git-send-email-chris@chris-wilson.co.uk
2016-08-05 10:54:37 +01:00
Chris Wilson
dcff85c844 drm/i915: Enable i915_gem_wait_for_idle() without holding struct_mutex
The principal motivation for this was to try and eliminate the
struct_mutex from i915_gem_suspend - but we still need to hold the mutex
current for the i915_gem_context_lost(). (The issue there is that there
may be an indirect lockdep cycle between cpu_hotplug (i.e. suspend) and
struct_mutex via the stop_machine().) For the moment, enabling last
request tracking for the engine, allows us to do busyness checking and
waiting without requiring the struct_mutex - which is useful in its own
right.

As a side-effect of having a robust means for tracking engine busyness,
we can replace our other busyness heuristic, that of comparing against
the last submitted seqno. For paranoid reasons, we have a semi-ordered
check of that seqno inside the hangchecker, which we can now improve to
an ordered check of the engine's busyness (removing a locked xchg in the
process).

v2: Pass along "bool interruptible" as being unlocked we cannot rely on
i915->mm.interruptible being stable or even under our control.
v3: Replace check Ironlake i915_gpu_busy() with the common precalculated value

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470388464-28458-6-git-send-email-chris@chris-wilson.co.uk
2016-08-05 10:54:37 +01:00
Chris Wilson
de895082f7 drm/i915: Remove highly confusing i915_gem_obj_ggtt_pin()
Since i915_gem_obj_ggtt_pin() is an idiom breaking curry function for
i915_gem_object_ggtt_pin(), spare us the confusion and remove it.
Removing it now simplifies later patches to change the i915_vma_pin()
(and friends) interface.

v2: Add a redundant GEM_BUG_ON(!view) to
i915_gem_obj_lookup_or_create_ggtt_vma()

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-18-git-send-email-chris@chris-wilson.co.uk
2016-08-04 20:20:00 +01:00
Chris Wilson
3272db5313 drm/i915: Combine all i915_vma bitfields into a single set of flags
In preparation to perform some magic to speed up i915_vma_pin(), which
is among the hottest of hot paths in execbuf, refactor all the bitfields
accessed by i915_vma_pin() into a single unified set of flags.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-16-git-send-email-chris@chris-wilson.co.uk
2016-08-04 20:19:59 +01:00
Chris Wilson
59bfa1248e drm/i915: Start passing around i915_vma from execbuffer
During execbuffer we look up the i915_vma in order to reserve them in
the VM. However, we then do a double lookup of the vma in order to then
pin them, all because we lack the necessary interfaces to operate on
i915_vma - so introduce i915_vma_pin()!

v2: Tidy parameter lists to remove one level of redirection in the hot
path.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-15-git-send-email-chris@chris-wilson.co.uk
2016-08-04 20:19:58 +01:00
Chris Wilson
20dfbde463 drm/i915: Wrap vma->pin_count accessors with small inline helpers
In the next few patches, the VMA pinning API is overhauled and to reduce
the churn we pull out the update to the accessors into a prep patch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-14-git-send-email-chris@chris-wilson.co.uk
2016-08-04 20:19:58 +01:00
Chris Wilson
de18003328 drm/i915: Record allocated vma size
Tracking the size of the VMA as allocated allows us to dramatically
reduce the complexity of later functions (like inserting the VMA in to
the drm_mm range manager).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-13-git-send-email-chris@chris-wilson.co.uk
2016-08-04 20:19:57 +01:00
Chris Wilson
e522ac2324 drm/i915: Remove surplus drm_device parameter to i915_gem_evict_something()
Eviction is VM local, so we can ignore the significance of the
drm_device in the caller, and leave it to i915_gem_evict_something() to
manage itself.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470324762-2545-2-git-send-email-chris@chris-wilson.co.uk
2016-08-04 20:19:50 +01:00
Chris Wilson
df0e9a287d Revert "drm/i915: Clean up associated VMAs on context destruction"
This reverts commit e9f24d5fb7.

The patch was only a stop-gap measure that fixed half the problem - the
leak of the fbcon when restarting X. A complete solution required
releasing the VMA when the object itself was closed rather than rely on
file/process exit. The previous patches add the VMA tracking necessary
to do close them along with the object, context or file, and so the time
has come to remove the partial fix.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-28-git-send-email-chris@chris-wilson.co.uk
2016-08-04 08:09:34 +01:00
Chris Wilson
50e046b6a0 drm/i915: Mark the context and address space as closed
When the user closes the context mark it and the dependent address space
as closed. As we use an asynchronous destruct method, this has two
purposes.  First it allows us to flag the closed context and detect
internal errors if we to create any new objects for it (as it is removed
from the user's namespace, these should be internal bugs only). And
secondly, it allows us to immediately reap stale vma.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-27-git-send-email-chris@chris-wilson.co.uk
2016-08-04 08:09:33 +01:00
Chris Wilson
b1f788c6ac drm/i915: Release vma when the handle is closed
In order to prevent a leak of the vma on shared objects, we need to
hook into the object_close callback to destroy the vma on the object for
this file. However, if we destroyed that vma immediately we may cause
unexpected application stalls as we try to unbind a busy vma - hence we
defer the unbind to when we retire the vma.

v2: Keep vma allocated until closed. This is useful for a later
optimisation, but it is required now in order to handle potential
recursion of i915_vma_unbind() by retiring itself.
v3: Comments are important.

Testcase: igt/gem_ppggtt/flink-and-close-vma-leak
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-26-git-send-email-chris@chris-wilson.co.uk
2016-08-04 08:09:33 +01:00
Chris Wilson
b0decaf75b drm/i915: Track active vma requests
Hook the vma itself into the i915_gem_request_retire() so that we can
accurately track when a solitary vma is inactive (as opposed to having
to wait for the entire object to be idle). This improves the interaction
when using multiple contexts (with full-ppgtt) and eliminates some
frequent list walking when retiring objects after a completed request.

A side-effect is that we get an active vma reference for free. The
consequence of this is shown in the next patch...

v2: Update inline names to be consistent with
i915_gem_object_get_active()

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-25-git-send-email-chris@chris-wilson.co.uk
2016-08-04 08:09:32 +01:00
Chris Wilson
2bfa996e03 drm/i915: Store owning file on the i915_address_space
For the global GTT (and aliasing GTT), the address space is owned by the
device (it is a global resource) and so the per-file owner field is
NULL. For per-process GTT (where we create an address space per
context), each is owned by the opening file. We can use this ownership
information to both distinguish GGTT and ppGTT address spaces, as well
as occasionally inspect the owner.

v2: Whitespace, tells us who owns i915_address_space

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-6-git-send-email-chris@chris-wilson.co.uk
2016-08-04 08:09:17 +01:00
Chris Wilson
34c998b4eb drm/i915: Rearrange GGTT probing to avoid needing a vfunc
Since we have a static if-else-chain for device probing of the global
GTT, we do not need to use a function pointer, let alone store it when
we never use it again. So use the if-else-chain to call down into the
device specific probe.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-5-git-send-email-chris@chris-wilson.co.uk
2016-08-04 08:09:16 +01:00
Chris Wilson
f6b9d5cabd drm/i915: Split early global GTT initialisation
Initialising the global GTT is tricky as we wish to use the drm_mm range
manager during the modesetting initialisation (to capture stolen
allocations from the BIOS) before we actually enable GEM. To overcome
this, we currently setup the drm_mm first and then carefully rebind
them.

v2: Fixup after rebasing
v3: GGTT initialisation needs to be split around kicking out conflicts
v4: Restore an old UMS BUG_ON(mappable > total) as a DRM_ERROR plus
fixup of probe results.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-4-git-send-email-chris@chris-wilson.co.uk
2016-08-04 08:09:15 +01:00
Chris Wilson
97d6d7ab68 drm/i915: Update GGTT initialisation functions to take drm_i915_private
Since these are internal functions they operate on drm_i915_private and
not the drm_device being passed in. So pass in the drm_i915_private
instead, and remove one layer of dancing. No space wins here, just
conforming to the norm in function parameters.

v2: Include all the probe functions

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-3-git-send-email-chris@chris-wilson.co.uk
2016-08-04 08:09:14 +01:00
Chris Wilson
0088e522dd drm/i915: Split GGTT initialisation between probing and setup
In order to handle conflicting drivers (i.e. vgacon) having a different
setup of hardware, we have to remove those other drivers before we try
to setup our own mappings. This requires us to split GGTT initialisation
between probing for the hardware location (part of the PCI BAR) and
later establishing the kernel resources for it.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470293567-10811-2-git-send-email-chris@chris-wilson.co.uk
2016-08-04 08:09:13 +01:00
Chris Wilson
7c9cf4e33a drm/i915: Reduce engine->emit_flush() to a single mode parameter
Rather than passing a complete set of GPU cache domains for either
invalidation or for flushing, or even both, just pass a single parameter
to the engine->emit_flush to determine the required operations.

engine->emit_flush(GPU, 0) -> engine->emit_flush(EMIT_INVALIDATE)
engine->emit_flush(0, GPU) -> engine->emit_flush(EMIT_FLUSH)
engine->emit_flush(GPU, GPU) -> engine->emit_flush(EMIT_FLUSH | EMIT_INVALIDATE)

This allows us to extend the behaviour easily in future, for example if
we want just a command barrier without the overhead of flushing.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Gordon <david.s.gordon@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-8-git-send-email-chris@chris-wilson.co.uk
2016-08-02 22:58:19 +01:00
Chris Wilson
c7fe7d25ed drm/i915: Remove obsolete engine->gpu_caches_dirty
Space for flushing the GPU cache prior to completing the request is
preallocated and so cannot fail - the GPU caches will always be flushed
along with the completed request. This means we no longer have to track
whether the GPU cache is dirty between batches like we had to with the
outstanding_lazy_seqno.

With the removal of the duplication in the per-backend entry points for
emitting the obsolete lazy flush, we can then further unify the
engine->emit_flush.

v2: Expand a bit on the legacy of gpu_caches_dirty

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-18-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-7-git-send-email-chris@chris-wilson.co.uk
2016-08-02 22:58:19 +01:00
Chris Wilson
7e37f889b5 drm/i915: Rename struct intel_ringbuffer to struct intel_ring
The state stored in this struct is not only the information about the
buffer object, but the ring used to communicate with the hardware. Using
buffer here is overly specific and, for me at least, conflates with the
notion of buffer objects themselves.

s/struct intel_ringbuffer/struct intel_ring/
s/enum intel_ring_hangcheck/enum intel_engine_hangcheck/
s/describe_ctx_ringbuf()/describe_ctx_ring()/
s/intel_ring_get_active_head()/intel_engine_get_active_head()/
s/intel_ring_sync_index()/intel_engine_sync_index()/
s/intel_ring_init_seqno()/intel_engine_init_seqno()/
s/ring_stuck()/engine_stuck()/
s/intel_cleanup_engine()/intel_engine_cleanup()/
s/intel_stop_engine()/intel_engine_stop()/
s/intel_pin_and_map_ringbuffer_obj()/intel_pin_and_map_ring()/
s/intel_unpin_ringbuffer()/intel_unpin_ring()/
s/intel_engine_create_ringbuffer()/intel_engine_create_ring()/
s/intel_ring_flush_all_caches()/intel_engine_flush_all_caches()/
s/intel_ring_invalidate_all_caches()/intel_engine_invalidate_all_caches()/
s/intel_ringbuffer_free()/intel_ring_free()/

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-15-git-send-email-chris@chris-wilson.co.uk
Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-4-git-send-email-chris@chris-wilson.co.uk
2016-08-02 22:58:16 +01:00
Chris Wilson
1dae2dfb0b drm/i915: Rename request->ringbuf to request->ring
Now that we have disambuigated ring and engine, we can use the clearer
and more consistent name for the intel_ringbuffer pointer in the
request.

@@
struct drm_i915_gem_request *r;
@@
- r->ringbuf
+ r->ring

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-12-git-send-email-chris@chris-wilson.co.uk
Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-2-git-send-email-chris@chris-wilson.co.uk
2016-08-02 22:58:15 +01:00
Chris Wilson
b5321f309b drm/i915: Unify intel_logical_ring_emit and intel_ring_emit
Both perform the same actions with more or less indirection, so just
unify the code.

v2: Add back a few intel_engine_cs locals

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-11-git-send-email-chris@chris-wilson.co.uk
Link: http://patchwork.freedesktop.org/patch/msgid/1470174640-18242-1-git-send-email-chris@chris-wilson.co.uk
2016-08-02 22:58:13 +01:00
Chris Wilson
2a1d775201 drm/i915: Prefer list_first_entry_or_null
list_first_entry_or_null() can generate better code than using
if (!list_empty()) {ptr = list_first_entry()) ..., so put it to use.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1469432687-22756-3-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1469530913-17180-2-git-send-email-chris@chris-wilson.co.uk
2016-07-26 13:00:58 +01:00
Chris Wilson
406ea8d22f drm/i915: Treat ringbuffer writes as write to normal memory
Ringbuffers are now being written to either through LLC or WC paths, so
treating them as simply iomem is no longer adequate. However, for the
older !llc hardware, the hardware is documentated as treating the TAIL
register update as serialising, so we can relax the barriers when filling
the rings (but even if it were not, it is still an uncached register write
and so serialising anyway.).

For simplicity, let's ignore the iomem annotation.

v2: Remove iomem from ringbuffer->virtual_address
v3: And for good measure add iomem elsewhere to keep sparse happy

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> #v2
Link: http://patchwork.freedesktop.org/patch/msgid/1469005202-9659-8-git-send-email-chris@chris-wilson.co.uk
Link: http://patchwork.freedesktop.org/patch/msgid/1469017917-15134-7-git-send-email-chris@chris-wilson.co.uk
2016-07-20 13:40:14 +01:00
Chris Wilson
48f112fed3 drm/i915: Fill unused GGTT with scratch pages for VT-d
One of the numerous VT-d workarounds we require is that the display
hardware reads past the end of the buffer triggering VT-d faults. This
is acknowledged in the code as being safe "since we fill the unused
portions of the GGTT with the scratch page". Alas, that is no longer
always true and so we trigger DMAR read faults.

Skylake also requires another workaround to avoid mixing VT-d and
unpopulated PTE, and so there we also need to ensure we fill unused
entries with the scratch page.

Reported-by: Mike Lothian <mike@fireburn.co.uk>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96584
Fixes: f7770bfd9f ("drm/i915: Skip clearing the GGTT on full-ppgtt systems")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: David Weinehall <david.weinehall@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1466773634-8106-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: David Weinehall <david.weinehall@intel.com>
2016-07-08 13:36:27 +01:00
Chris Wilson
91c8a326a1 drm/i915: Convert dev_priv->dev backpointers to dev_priv->drm
Since drm_i915_private is now a subclass of drm_device we do not need to
chase the drm_i915_private->dev backpointer and can instead simply
access drm_i915_private->drm directly.

   text	   data	    bss	    dec	    hex	filename
1068757	   4565	    416	1073738	 10624a	drivers/gpu/drm/i915/i915.ko
1066949	   4565	    416	1071930	 105b3a	drivers/gpu/drm/i915/i915.ko

Created by the coccinelle script:
@@
struct drm_i915_private *d;
identifier i;
@@
(
- d->dev->i
+ d->drm.i
|
- d->dev
+ &d->drm
)

and for good measure the dev_priv->dev backpointer was removed entirely.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467711623-2905-4-git-send-email-chris@chris-wilson.co.uk
2016-07-05 11:58:45 +01:00
Chris Wilson
8eb95204d9 drm/i915: Amalgamate gen6_mm_switch() and vgpu_mm_switch()
These are identical, so let's just use the same vfunc.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1467618513-4966-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
2016-07-05 11:24:59 +01:00
Chris Wilson
fac5e23e3c drm/i915: Mass convert dev->dev_private to to_i915(dev)
Since we now subclass struct drm_device, we can save pointer dances by
noting the equivalence of struct drm_device and struct drm_i915_private,
i.e. by using to_i915().

   text    data     bss     dec     hex filename
1073824    4562     416 1078802  107612 drivers/gpu/drm/i915/i915.ko
1068976    4562     416 1073954  106322 drivers/gpu/drm/i915/i915.ko

Created by the coccinelle script:

@@
expression E;
identifier p;
@@
- struct drm_i915_private *p = E->dev_private;
+ struct drm_i915_private *p = to_i915(E);

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467628477-25379-1-git-send-email-chris@chris-wilson.co.uk
2016-07-04 12:54:07 +01:00
Dave Gordon
731f74c5c6 drm/i915: tweak gen6_for_{each_pde, all_pdes} macros
Gen8 versions of these macros were updated a few months ago
(e8ebd8e drm/i915: eliminate 'temp' in gen8_for_each macros)
originally because at least one iterator could generate an
out of bounds access, but also because eliminating the 'temp'
parameter generated smaller and faster code.

Matthew Auld recently noticed the same problem with the gen6
versions and provided a patch
https://lists.freedesktop.org/archives/intel-gfx/2016-June/099334.html
but while we're changing these, we might as well make them as
much like the gen8 versions as possible, including the style
of using "&& (..., true)" rather than ": (..., 1) : 0", and
of course eliminating the redundant 'temp'.

Furthermore, the "all_pdes" version is only used in one place,
so we can improve code efficiency by changing both the macro
parameters and the calling code to reduce extra dereferences.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1466793466-23500-1-git-send-email-david.s.gordon@intel.com
2016-06-27 13:13:53 +01:00
Chris Wilson
6e5a5beb8e drm/i915: Split idling from forcing context switch
We only need to force a switch to the kernel context placeholder during
eviction. All other uses of i915_gpu_idle() just want to wait until
existing work on the GPU is idle. Rename i915_gpu_idle() to
i915_gem_wait_for_idle() to avoid any implications about "parking" the
context first.

v2: Tweak an error message if the wait fails for the ilk vtd w/a

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1466776558-21516-6-git-send-email-chris@chris-wilson.co.uk
2016-06-24 15:03:14 +01:00
Zhi Wang
b02d22a399 drm/i915: Fold vGPU active check into inner functions
v5:
- Let functions take struct drm_i915_private *. (Tvrtko)

- Fold vGPU related active check into the inner functions. (Kevin)

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Suggested-by: Kevin Tian <kevin.tian@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1466078825-6662-4-git-send-email-zhi.a.wang@intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-06-17 19:44:29 +01:00
Chris Wilson
d6473f5664 drm/i915: Add support for mapping an object page by page
Introduced a new vm specfic callback insert_page() to program a single pte in
ggtt or ppgtt. This allows us to map a single page in to the mappable aperture
space. This can be iterated over to access the whole object by using space as
meagre as page size.

v2: Added low level rpm assertions to insert_page routines (Chris)

v3: Added POSTING_READ post register write (Tvrtko)

v4: Rebase (Ankit)

v5: Removed wmb() and FLUSH_CTL from insert_page, caller to take care
of it (Chris)

v6: insert_page not working correctly without FLSH_CNTL write, added the
write again.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-06-13 10:03:54 +01:00
Dave Gordon
85d1225ec0 drm/i915: Introduce & use new lightweight SGL iterators
The existing for_each_sg_page() iterator is somewhat heavyweight, and is
limiting i915 driver performance in a few benchmarks. So here we
introduce somewhat lighter weight iterators, primarily for use with GEM
objects or other case where we need only deal with whole aligned pages.

Unlike the old iterator, the new iterators use an internal state
structure which is not intended to be accessed by the caller; instead
each takes as a parameter an output variable which is set before each
iteration. This makes them particularly simple to use :)

One of the new iterators provides the caller with the DMA address of
each page in turn; the other provides the 'struct page' pointer required
by many memory management operations.

Various uses of for_each_sg_page() are then converted to the new macros.

v2: Force inlining of the sg_iter constructor and make the union
anonymous.

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1463741647-15666-4-git-send-email-chris@chris-wilson.co.uk
2016-05-20 13:43:00 +01:00
Chris Wilson
f7770bfd9f drm/i915: Skip clearing the GGTT on full-ppgtt systems
Under full-ppgtt, access to the global GTT is carefully regulated
through hardware functions (i.e. userspace cannot read and write to
arbitrary locations in the GGTT via the GPU). With this restriction in
place, we can forgo clearing stale entries from the GGTT as they will
not be accessed.

For aliasing-ppgtt, we could almost do the same except that we do allow
userspace access to the global-GTT via execbuf in order to workraound
some quirks of certain instructions. (This execbuf path is filtered out
with EINVAL on full-ppgtt.)

The most dramatic effect this will have will be during resume, as with
full-ppgtt the GGTT is only used sparingly.

References: https://bugs.freedesktop.org/show_bug.cgi?id=94722
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: David Weinehall <david.weinehall@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Tested-by: David Weinehall <david.weinehall@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463207195-22076-4-git-send-email-chris@chris-wilson.co.uk
2016-05-14 08:51:39 +01:00
Chris Wilson
975f7ff42e drm/i915: Lazily migrate the objects after hibernation
Now that we mark the object domains for having been restored from the
hibernation image, we not need to flush everything during resume and
can instead rely on the normal domain tracking to flush only when
required. The only caveat here are objects that are pinned for use by
the hardware, whose contents must be coherent for when the device
resumes reading from then (shortly afterwards with the driver assuming
the objects are in the correct domain).

References: https://bugs.freedesktop.org/show_bug.cgi?id=94722
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Cc: David Weinehall <david.weinehall@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Tested-by: David Weinehall <david.weinehall@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463207195-22076-3-git-send-email-chris@chris-wilson.co.uk
2016-05-14 08:51:39 +01:00
Chris Wilson
dc97997a21 drm/i915: Use drm_i915_private as the native pointer for intel_uncore.c
Pass drm_i915_private to the uncore init/fini routines and their
subservients as it is their native type.

   text    data     bss     dec     hex filename
6309978 3578778  696320 10585076         a183f4 vmlinux
6309530 3578778  696320 10584628         a18234 vmlinux

a modest 400 bytes of saving, but 60 lines of code deleted!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1462885804-26750-1-git-send-email-chris@chris-wilson.co.uk
2016-05-10 17:20:20 +01:00
Ville Syrjälä
ac840ae535 drm/i915: Re-enable GGTT earlier during resume on pre-gen6 platforms
Move the intel_enable_gtt() call to happen before we touch the GTT
during resume. Right now it's done way too late. Before
commit ebb7c78d35 ("agp/intel-gtt: Only register fake agp driver for gen1")
it was actually done earlier on account of also getting called from
the resume hook of the fake agp driver. With the fake agp driver
no longer getting registered we must move the call up.

The symptoms I've seen on my 830 machine include lowmem corruption,
other kinds of memory corruption, and straight up hung machine during
or just after resume. Not really sure what causes the memory corruption,
but so far I've not seen any with this fix.

I think we shouldn't really need to call this during init, but we have
been doing that so I've decided to keep the call. However moving that
call earlier could be prudent as well. Doing it right after the
intel-gtt probe seems appropriate.

Also tested this on 946gz,elk,ilk and all seemed quite happy with
this change.

v2: Reorder init_hw vs. enable_hw functions (Chris)

Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: drm-intel-fixes@lists.freedesktop.org
Fixes: ebb7c78d35 ("agp/intel-gtt: Only register fake agp driver for gen1")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1462559755-353-1-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-05-10 12:29:33 +03:00
Chris Wilson
c033666a94 drm/i915: Store a i915 backpointer from engine, and use it
text	   data	    bss	    dec	    hex	filename
6309351	3578714	 696320	10584385	 a18141	vmlinux
6308391	3578714	 696320	10583425	 a17d81	vmlinux

Almost 1KiB of code reduction.

v2: More s/INTEL_INFO()->gen/INTEL_GEN()/ and IS_GENx() conversions

   text	   data	    bss	    dec	    hex	filename
6304579	3578778	 696320	10579677	 a16edd	vmlinux
6303427	3578778	 696320	10578525	 a16a5d	vmlinux

Now over 1KiB!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1462545621-30125-3-git-send-email-chris@chris-wilson.co.uk
2016-05-09 13:41:24 +01:00
Chris Wilson
cba6dba4e5 drm/i915: Unexport i915_ppgtt_init()
As i915_ppgtt_init() is not used outside of i915_gem_gtt.c we can make
it static.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1462443767-5194-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
2016-05-05 14:16:39 +01:00
Chris Wilson
0e4ca1008e drm/i915: Fix ordering of sanitize ppgtt and sanitize execlists
The i915.enable_ppgtt option depends upon the state of
i915.enable_execlists option - so we need to sanitize execlists first.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461932305-14637-2-git-send-email-chris@chris-wilson.co.uk
2016-04-29 13:59:05 +01:00
Chris Wilson
f9326be5f1 drm/i915: Rearrange switch_context to load the aliasing ppgtt on first use
The code to switch_mm() is already handled by i915_switch_context(), the
only difference required to setup the aliasing ppgtt is that we need to
emit te switch_mm() on the first context, i.e. when transitioning from
engine->last_context == NULL. This allows us to defer the
initialisation of the GPU from early device initialisation to first use,
which should marginally speed up both. The caveat is that we then defer
the context initialisation until first use - i.e. we cannot assume that
the GPU engines are initialised. For example, this means that power
contexts for rc6 (Ironlake) need to explicitly loaded, as they are.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-11-git-send-email-chris@chris-wilson.co.uk
2016-04-28 12:17:32 +01:00
Chris Wilson
8ef8561f2c drm/i915: Move ioremap_wc tracking onto VMA
By tracking the iomapping on the VMA itself, we can share that area
between multiple users. Also by only revoking the iomapping upon
unbinding from the mappable portion of the GGTT, we can keep that iomap
across multiple invocations (e.g. execlists context pinning).

Note that by moving the iounnmap tracking to the VMA, we actually end up
fixing a leak of the iomapping in intel_fbdev.

v1.5: Rebase prompted by Tvrtko
v2: Drop dev_priv parameter, we can recover the i915_ggtt from the vma.
v3: Move handling of ioremap space exhaustion to vmap_purge and also
allow vmallocs to recover old iomaps. Add Tvrtko's kerneldoc.
v4: Fix a use-after-free in shrinker and rearrange i915_vma_iomap
v5: Back to i915_vm_to_ggtt
v6: Use i915_vma_pin_iomap and i915_vma_unpin_iomap to mark critical
sections and ensure the VMA cannot be reaped whilst mapped.
v7: Move i915_vma_iounmap so that consumers of the API are not tempted,
and add iomem annotations

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-5-git-send-email-chris@chris-wilson.co.uk
2016-04-28 12:17:32 +01:00
Chris Wilson
ce7fda2e15 drm/i915: Introduce i915_vm_to_ggtt()
In a couple of places, we have an i915_address_space that we know is
really an i915_ggtt that we want to use. Create an inline helper to
convert from the i915_address_space subclass into its container.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461833819-3991-4-git-send-email-chris@chris-wilson.co.uk
2016-04-28 12:17:32 +01:00
Matthew Auld
64c050dbae drm/i915: tidy up gen8_init_scratch
Prefer a goto teardown path to do all the required cleanup.

v2:
(Joonas Lahtinen)
  - remove NULL assignments
  - rename goto labels

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461759565-12086-1-git-send-email-matthew.auld@intel.com
2016-04-27 16:16:56 +03:00
Matthew Auld
df28564d98 drm/i915: use dev_priv directly in gen8_ppgtt_notify_vgt
Remove dev local and use to_i915() in gen8_ppgtt_notify_vgt.

v2: use dev_priv directly for QUESTION_MACROS (Joonas Lahtinen)

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1461323365-21256-1-git-send-email-matthew.auld@intel.com
2016-04-22 15:32:56 +03:00
Matthew Auld
44a7102484 drm/i915: call kunmap_px on pt_vaddr
We need to kunmap pt_vaddr and not pt itself, otherwise we end up
mapping a bunch of pages without ever unmapping them.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Fixes: d1c54acd67 ("drm/i915/gtt: Introduce kmap|kunmap for dma page")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460476663-24890-4-git-send-email-matthew.auld@intel.com
2016-04-19 17:24:17 +03:00
Mika Kuoppala
3accaf7e73 drm/i915: Store and use edram capabilities
Store the edram capabilities instead of only the size of
edram. This is preparatory patch to allow edram size calculation
based on edram capability bits for gen9+. With gen9 the
edram is behind llc and is a separate entity. With hsw/bdw
it was more of a victim cache for LLC so the name 'eLLC' might
be warranted. Regardless, rename all mentions of eLLC to EDRAM to
clear the confusion.

v2: return bytes for edram size (Chris)
    s/eLLC/eDRAM in output if we are gen > 8

v3: rebase, INTEL_GEN (Chris)

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-14 12:27:37 +03:00
Chris Wilson
f2a85e1975 drm,i915: Introduce drm_malloc_gfp()
I have instances where I want to use drm_malloc_ab() but with a custom
gfp mask. And with those, where I want a temporary allocation, I want to
try a high-order kmalloc() before using a vmalloc().

So refactor my usage into drm_malloc_gfp().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: dri-devel@lists.freedesktop.org
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460113874-17366-6-git-send-email-chris@chris-wilson.co.uk
2016-04-11 17:13:10 +01:00
Joonas Lahtinen
2d1fe07340 drm/i915: Do not use {HAS_*, IS_*, INTEL_INFO}(dev_priv->dev)
dev_priv is what the macro works hard to extract, pass it directly.

> sed 's/\([A-Z].*(dev_priv\)->dev)/\1)/g'

v2:
- Include all wrapper macros too (Chris)

v3:
- Include sed cmdline (Chris)

v4:
- Break long line
- Rebase

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1460016485-8089-1-git-send-email-joonas.lahtinen@linux.intel.com
2016-04-07 14:50:26 +03:00
Joonas Lahtinen
e5716f5575 drm/i915: Use i915_vm_to_ppgtt instead of manual container_of
Looks much better without container_of everywhere.

v2:
- In i915_gem_restore_gtt_mappings too (Chris)

v3:
- Do not cause WARN by calling on non PPGTT object (Chris)

Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-04-07 14:49:42 +03:00
Joonas Lahtinen
72e96d6450 drm/i915: Refer to GGTT {,VM} consistently
Refer to the GGTT VM consistently as "ggtt->base" instead of just "ggtt",
"vm" or indirectly through other variables like "dev_priv->ggtt.base"
to avoid confusion with the i915_ggtt object itself and PPGTT VMs.

Refer to the GGTT as "ggtt" instead of indirectly through chaining.

As a bonus gets rid of the long-standing i915_obj_to_ggtt vs.
i915_gem_obj_to_ggtt conflict, due to removal of i915_obj_to_ggtt!

v2:
- Added some more after grepping sources with Chris

v3:
- Refer to GGTT VM through ggtt->base consistently instead of ggtt_vm
  (Chris)

v4:
- Convert all dev_priv->ggtt->foo accesses to ggtt->foo.

v5:
- Make patch checker happy

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-03-31 17:55:43 +03:00
Joonas Lahtinen
d85489d314 drm/i915: Rename GGTT init functions
Rename and document the GGTT init functions to give a better
idea of the context where they are called from.

i915_gem_gtt_init => i915_ggtt_init_hw
i915_gem_init_global_gtt => i915_gem_init_ggtt
i915_global_gtt_cleanup => i915_ggtt_cleanup_hw

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458830866-12578-1-git-send-email-joonas.lahtinen@linux.intel.com
2016-03-30 13:47:18 +03:00
Matthew Auld
ade7daa164 drm/i915: BUG_ON when ggtt_view is NULL
Lets BUG_ON and don't bother with a WARN and returning an error, so we can
remove the need to pollute the code with error handling, after all it is
a programmer error to provide NULL view. Also while we're here remove
redundant NULL ggtt_view check.

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458834860-7898-1-git-send-email-matthew.auld@intel.com
2016-03-30 13:47:18 +03:00
Dave Gordon
b4ac5afc6b drm/i915: replace for_each_engine()
Having provided for_each_engine_id() for cases where the third (id)
argument is useful, we can now replace all the remaining instances with
a simpler version that takes only two parameters. In many cases, this
also allows the elimination of the local variable used in the iterator
(usually 'i').

v2:
    s/dev_priv/(dev_priv__)/ in body of for_each_engine_masked() [Chris Wilson]

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458757194-17783-2-git-send-email-david.s.gordon@intel.com
2016-03-24 14:34:11 +00:00
Chris Wilson
321d178edb drm/i915: Tidy aliasing_gtt_bind_vma()
In commit 0a87871626
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Thu Oct 15 14:23:01 2015 +0200

    drm/i915: restore ggtt double-bind avoidance

we wrote the ggtt_bind_vma() observing a number of cleanups we could do
over the template of aliasing_gtt_bind_vma(). Now let's apply the
cleanups we made there back to the original. The essence is to avoid
redundant variables and assignements, and by doing so make the code
easier to read.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1448015238-24639-1-git-send-email-chris@chris-wilson.co.uk
2016-03-23 18:02:47 +01:00
Chris Wilson
c890e2d531 drm/i915: Codify our assumption that the Global GTT is <= 4GiB
Throughout the code base, we use u32 for offsets into the global GTT. If
we ever see any hardware with a larger GGTT, then we run the real risk
of silent corruption. So test for our assumption up front so that we
have a nice reminder should the time come when it fails.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
[Rebased and changed 1ull -> 1ULL, cut 80 char line]
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1458290579-27783-1-git-send-email-joonas.lahtinen@linux.intel.com
2016-03-18 15:19:18 +02:00
Joonas Lahtinen
d507d73578 drm/i915/gtt: Clean up GGTT probing code
Use less pointers with the probing code, making it much less confusing
to read.

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-03-18 15:18:15 +02:00
Joonas Lahtinen
62106b4f6b drm/i915: Rename dev_priv->gtt to dev_priv->ggtt
Refer to Global GTT consistently as GGTT, thus rename dev_priv->gtt
to dev_priv->ggtt and struct i915_gtt to struct i915_ggtt.

Fix a couple of whitespace problems while at it.

v2:
- Fix a typo in commit message.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-03-18 15:18:15 +02:00
Joonas Lahtinen
dc3b04fbf4 drm/i915/gtt: Reference mappable_end variable from pointer
Reference variable value from pointer, not assumed pointer destination.

Since:

commit c44ef60e43
Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Date:   Thu Jun 25 18:35:05 2015 +0300

    drm/i915/gtt: Allow >= 4GB sizes for vm.

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-03-18 15:18:14 +02:00
Tvrtko Ursulin
39dabecd99 drm/i915: Use shorter route to dev_private where possible
Where we have a request we can use req->i915 directly instead
of going through the engine and device. Coccinelle script:

@@
function f;
identifier r;
@@
f(..., struct drm_i915_gem_request *r, ...)
{
...
- engine->dev->dev_private
+ r->i915
...
}
@@
struct drm_i915_gem_request *req;
@@
(
  req->
- engine->dev->dev_private
+ i915
)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1458219850-21007-1-git-send-email-tvrtko.ursulin@linux.intel.com
2016-03-18 09:50:37 +00:00
Tvrtko Ursulin
666796da7a drm/i915: More intel_engine_cs renaming
Some trivial ones, first pass done with Coccinelle:

@@
@@
(
- I915_NUM_RINGS
+ I915_NUM_ENGINES
|
- intel_ring_flag
+ intel_engine_flag
|
- for_each_ring
+ for_each_engine
|
- i915_gem_request_get_ring
+ i915_gem_request_get_engine
|
- intel_ring_idle
+ intel_engine_idle
|
- i915_gem_reset_ring_status
+ i915_gem_reset_engine_status
|
- i915_gem_reset_ring_cleanup
+ i915_gem_reset_engine_cleanup
|
- init_ring_lists
+ init_engine_lists
)

But that didn't fully work so I cleaned it up with:

for f in *.[hc]; do sed -i -e s/I915_NUM_RINGS/I915_NUM_ENGINES/ $f; done
for f in *.[hc]; do sed -i -e s/i915_gem_request_get_ring/i915_gem_request_get_engine/ $f; done
for f in *.[hc]; do sed -i -e s/intel_ring_flag/intel_engine_flag/ $f; done
for f in *.[hc]; do sed -i -e s/intel_ring_idle/intel_engine_idle/ $f; done
for f in *.[hc]; do sed -i -e s/init_ring_lists/init_engine_lists/ $f; done
for f in *.[hc]; do sed -i -e s/i915_gem_reset_ring_cleanup/i915_gem_reset_engine_cleanup/ $f; done
for f in *.[hc]; do sed -i -e s/i915_gem_reset_ring_status/i915_gem_reset_engine_status/ $f; done

v2: Rebase.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-03-16 15:33:24 +00:00
Tvrtko Ursulin
4a570db57c drm/i915: Rename intel_engine_cs struct members
below and a couple manual fixups.

@@
identifier I, J;
@@
struct I {
...
- struct intel_engine_cs *J;
+ struct intel_engine_cs *engine;
...
}
@@
identifier I, J;
@@
struct I {
...
- struct intel_engine_cs J;
+ struct intel_engine_cs engine;
...
}
@@
struct drm_i915_private *d;
@@
(
- d->ring
+ d->engine
)
@@
struct i915_execbuffer_params *p;
@@
(
- p->ring
+ p->engine
)
@@
struct intel_ringbuffer *r;
@@
(
- r->ring
+ r->engine
)
@@
struct drm_i915_gem_request *req;
@@
(
- req->ring
+ req->engine
)

v2: Script missed the tracepoint code - fixed up by hand.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-03-16 15:33:17 +00:00
Tvrtko Ursulin
e2f8039147 drm/i915: Rename local struct intel_engine_cs variables
Done by the Coccinelle script below plus a manual
intervention to GEN8_RING_SEMAPHORE_INIT.

@@
expression E;
@@
- struct intel_engine_cs *ring = E;
+ struct intel_engine_cs *engine = E;
<+...
- ring
+ engine
...+>
@@
@@
- struct intel_engine_cs *ring;
+ struct intel_engine_cs *engine;
<+...
- ring
+ engine
...+>

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-03-16 15:33:00 +00:00
Ville Syrjälä
11f20322e0 drm/i915: Move the NULL sg handling out from rotate_pages()
rotate_pages() checks to see if it got called with a NULL sg, and then
goes to extract it from sg->sgl. It always gets called with a NULL sg
for the first plane, so moving the initial 'sg=st->sgl' assignment out
into intel_rotate_fb_obj_pages() seems less special-casey.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1455569699-27905-9-git-send-email-ville.syrjala@linux.intel.com
2016-03-01 12:48:09 +02:00
Ville Syrjälä
1663b9d6a2 drm/i915: Reorganize intel_rotation_info
Throw out a bunch of unnecessary stuff from struct intel_rotation_info,
and pull most of the remaining stuff to live under an array of
per-color plane sub-structures.

What still remains outside the sub-structure will be reorgranized later
as well, but that requires more work elsewhere so leave it be for now.

v2: Split the vma size == luma+chroma size fix to prep patch (Daniel)

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v1)
Link: http://patchwork.freedesktop.org/patch/msgid/1455569699-27905-8-git-send-email-ville.syrjala@linux.intel.com
2016-03-01 12:48:09 +02:00
Ville Syrjälä
9106cf1747 drm/i915: Account for the size of the chroma plane for the rotated gtt view
The size of the rotated ggtt mapping ought to include the size of the
chroma plane as well. Not a huge deal since we don't expose NV12 (or any
pother planar format for that matter) yet.

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Fixes: 89e3e14276 ("drm/i915: Support NV12 in rotated GGTT mapping")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1455569699-27905-2-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2016-03-01 12:48:08 +02:00
Chris Wilson
596c592319 drm/i915: Reduce the pointer dance of i915_is_ggtt()
The multiple levels of indirect do nothing but hinder the compiler and
the pointer chasing turns to be quite painful but painless to fix.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1456484600-11477-1-git-send-email-tvrtko.ursulin@linux.intel.com
2016-02-26 13:15:39 +00:00
Chris Wilson
1c7f4bca5a drm/i915: Rename vma->*_list to *_link for consistency
Elsewhere we have adopted the convention of using '_link' to denote
elements in the list (and '_list' for the actual list_head itself), and
that the name should indicate which list the link belongs to (and
preferrably not just where the link is being stored).

s/vma_link/obj_link/ (we iterate over obj->vma_list)
s/mm_list/vm_link/ (we iterate over vm->[in]active_list)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2016-02-26 13:15:39 +00:00
Tim Gore
d5165ebd52 drm/i915: implement WaIncreaseDefaultTLBEntries
WaIncreaseDefaultTLBEntries increases the number of TLB
entries available for GPGPU workloads and gives significant
( > 10% ) performance gain for some OCL benchmarks.
Put this in a new function that can be a place for
workarounds that are GT related but not required per ring.
This function is called on driver load and also after a
reset and on resume, so it is safe for workarounds that get
clobbered in these situations. This function currently has
just this one workaround.

v2: This was originally split into 3 patches but following
  review feedback was squashed into 1.
  I have not incorporated some style comments from Chris
  Wilson as I felt that after defining and intialising a
  temporary variable and then adding an additional if block
  to only write the register if the temporary variable had
  been set, this didn't really give a net gain.

v3: Resending in the hope that BAT will run

v4: Change subject line to trigger BAT (please!)

Signed-off-by: Tim Gore <tim.gore@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1454586574-2343-1-git-send-email-tim.gore@intel.com
2016-02-04 14:22:31 +00:00
Ville Syrjälä
11d23e6fa1 drm/i915: Pass rotation_info to intel_rotate_fb_obj_pages()
intel_rotate_fb_obj_pages() doens't need the entire gtt view, just the
rotation info suffices.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1453316739-13296-4-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-01-28 20:56:32 +02:00
Ville Syrjälä
871302555b drm/i915: Pass stride to rotate_pages()
Pass stride in addition to width and height to rotate_pages(). For now
width and stride are the same, but once framebuffer offsets enter the
scene that may no longer be the case.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1453316739-13296-3-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-01-28 20:56:14 +02:00
Ville Syrjälä
7723f47dc6 drm/i915: Rename the rotated gtt view member to 'rotated'
Also rename 'rotation_info' to 'rotated' to match the view type exactly,
this should avoid confusion which union members is valid for each view
type.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1453316739-13296-2-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-01-28 20:55:55 +02:00
Imre Deak
a4eba47b25 drm/i915: Move stolen memory initialization earlier during loading
The only device specific dependency of the stolen memory setup is the
MMIO mapping and the stolen memory size. Both are already available in
i915_gtt_init(), so move the stolen initialization to there. The
clean-up code for i915_gtt_init() is in i915_global_gtt_cleanup(), so
move the stolen memory clean-up code there too.

This will be needed by an upcoming patch that needs the details of the
memory we reserve, but the change is also part of our generic goal to
move the initialization of resources with no or little dependencies on
other device specific resources towards the beginning of the init
sequence.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: David Weinehall <david.weinehall@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1453209992-25995-8-git-send-email-imre.deak@intel.com
2016-01-27 17:43:16 +02:00
Ville Syrjälä
2d7f3bdb2c drm/i915: Pass the dma_addr_t array as const to rotate_pages()
rotate_pages() doesn't modify the passed in dma addresses, so make
them const.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1452777736-4909-4-git-send-email-ville.syrjala@linux.intel.com
2016-01-15 21:04:27 +02:00
Ville Syrjälä
b5e16987a0 drm/i915: Set i915_ggtt_view_normal type explicitly
Just for clarity set the type for i915_ggtt_view_normal explicitly.

While at it fix the indentation fail for i915_ggtt_view_rotated.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1452777736-4909-3-git-send-email-ville.syrjala@linux.intel.com
2016-01-15 21:04:16 +02:00
Chris Wilson
c140330b5e drm/i915: Move Braswell stop_machine GGTT insertion workaround
There was a silent conflict between

commit 0a87871626
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Thu Oct 15 14:23:01 2015 +0200

    drm/i915: restore ggtt double-bind avoidance

and

commit 5bab6f60cb
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Oct 23 18:43:32 2015 +0100

    drm/i915: Serialise updates to GGTT with access through GGTT on Braswell

thankfully caught by the extra WARN safegaurd in 0a878716. Since we now
override the GGTT insert_pages callback when installing the aliasing
ppgtt, we assert that the callback is the original ggtt routine.
However, on Braswell we now use a different insertion routine to
serialise access through the GGTT with updating the PTE and hence the
conflict. To avoid the conflict, move the custom insertion routine for
Braswell down a level.

Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1447859979-20107-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: drm-intel-fixes@lists.freedesktop.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-12-21 11:17:56 +01:00