linux

Author	SHA1	Message	Date
Chris Wilson	149c86e74f	drm/i915: Allocate context objects from stolen As we never expose context objects directly to userspace, we can forgo allocating a first-class GEM object for them and prefer to use the limited resource of reserved/stolen memory for them. Note this means that their initial contents are undefined. However, a downside of using stolen objects for execlists is that we cannot access the physical address directly (thanks MCH!) which prevents their use. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 10:41:24 +02:00
Chris Wilson	d7b9ca2f7a	drm/i915: Remove request->uniq We already assign a unique identifier to every request: seqno. That someone felt like adding a second one without even mentioning why and tweaking ABI smells very fishy. Fixes regression from commit `b3a38998f0` Author: Nick Hoath <nicholas.hoath@intel.com> Date: Thu Feb 19 16:30:47 2015 +0000 drm/i915: Fix a use after free, and unbalanced refcounting v2: Rebase Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Nick Hoath <nicholas.hoath@intel.com> Cc: Thomas Daniel <thomas.daniel@intel.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Jani Nikula <jani.nikula@intel.com> [danvet: Fixup because different merge order.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 10:41:11 +02:00
Chris Wilson	423795cbac	drm/i915: Prefer to check for idleness in worker rather than sync-flush Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 10:37:31 +02:00
Chris Wilson	74cdb337c0	drm/i915: Tidy gen8 IRQ handler Remove some needless variables and parameter passing. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 10:36:13 +02:00
Chris Wilson	cb0d205e0f	drm/i915: Reduce locking in gen8 IRQ handler Similar in vain in reducing the number of unrequired spinlocks used for execlist command submission (where the forcewake is required but manually controlled), we know that the IRQ registers are outside of the powerwell and so we can access them directly. Since we now have direct access exported via I915_READ_FW/I915_WRITE_FW, lets put those to use in the irq handlers as well. In the process, reorder the execlist submission to happen as early as possible. v2: Restrict the untraced register mmio to just the GT path (i.e. the hotpath for execlists) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 10:33:32 +02:00
Chris Wilson	a6111f7b66	drm/i915: Reduce locking in execlist command submission This eliminates six needless spin lock/unlock pairs when writing out ELSP. v2: Respin with my preferred colour. v3: Mostly back to the original colour Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> [v1] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 10:31:44 +02:00
Daniel Vetter	19ee66af15	drm/i915: Remove unused variable in intel_lrc.c Already tagged this one and 0-day builder is failing me. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>	2015-04-10 10:18:47 +02:00
Chris Wilson	e20d2ab741	drm/i915: Use a separate slab for vmas vma are more frequently allocated than objects and so should equally benefit from having a dedicated slab. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 10:17:13 +02:00
Chris Wilson	efab6d8dd1	drm/i915: Use a separate slab for requests requests are even more frequently allocated than objects and equally benefit from having a dedicated slab. v2: Rebase Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 10:17:07 +02:00
Matt Roper	f1e2daea79	drm/i915: Clear crtc atomic flags at beginning of transaction Once we have full atomic modeset, these kind of flags should be in a real intel_crtc_state that's tracked properly. In the meantime, make sure we clear out any old flags at the beginning of a transaction so that we don't wind up seeing leftover flags from old transactions that were checked, but never went to the commit step. At the moment, a failed check or prepare could leave stale flags behind that interfere with the next atomic transaction. v2: Just do a memset; the series this patch was originally part of placed additional fields into the structure that shouldn't be cleared, but that's no longer the case. Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 09:37:01 +02:00
Matt Roper	70a101f863	drm/i915: Switch to full atomic helpers for plane updates/disable, take two Switch from our plane update/disable entrypoints to use the full atomic helpers (which generate a top-level atomic transaction) rather than the transitional helpers (which only create/manipulate orphaned plane states independent of a top-level transaction). Various upcoming work (SKL scalers, atomic watermarks, etc.) requires a full atomic transaction to behave properly/cleanly. Last time we tried this, we had to back out the change because we still call the drm_plane vfuncs directly from within our legacy modesetting code. This potentially results in nested atomic transactions, locking collisions, and other failures. To avoid that problem again, we sidestep the issue by calling the transitional helpers directly (rather than through a vfunc) when we're nested inside of other legacy modesetting code. However this does allow legacy SetPlane() ioctl's to process an entire drm_atomic_state transaction, which is important for upcoming patches. Cc: Chandra Konduru <chandra.konduru@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 09:36:54 +02:00
Daniel Vetter	3e1ab4b705	drm/i915: Update DRIVER_DATE to 20150410 Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 09:31:40 +02:00
Chris Wilson	595e1eeb26	drm/i915: Remove vestigal DRI1 ring quiescing code After the removal of DRI1, all access to the rings are through requests and so we can always be sure that there is a request to wait upon to free up available space. The fallback code only existed so that we could quiesce the GPU following unmediated access by DRI1. v2: Rebase Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:56:15 +02:00
Chris Wilson	4bb1bedb28	drm/i915: Use the global runtime-pm wakelock for a busy GPU for execlists When we submit a request to the GPU, we first take the rpm wakelock, and only release it once the GPU has been idle for a small period of time after all requests have been complete. This means that we are sure no new interrupt can arrive whilst we do not hold the rpm wakelock and so can drop the individual get/put around every single request inside execlists. Note: to close one potential issue we should mark the GPU as busy earlier in __i915_add_request. To elaborate: The issue is that we emit the irq signalling sequence before we grab the rpm reference, which means we could miss the resulting interrupt (since that's not set up when suspended). The only bad side effect is a missed interrupt, gt mmio writes automatically wake up the hw itself. But otoh we have an umbrella rpm reference for the entirety of execbuf, as long as that's there we're covered. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Explain a bit more about the add_request issue, which after some irc chatting with Chris turns out to not be an issue really.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:56:14 +02:00
Chris Wilson	b5eba37283	drm/i915: Use simpler form of spin_lock_irq(execlist_lock) We can use the simpler spinlock form to disable interrupts as we are always outside of an irq/softirq handler. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:56:14 +02:00
Ville Syrjälä	90e4f1592b	drm/i915: Fix the VBT child device parsing for BSW Recent BSW VBT has a VBT child device size 37 bytes instead of the 33 bytes our code assumes. This means we fail to parse the VBT and thus fail to detect eDP ports properly and just register them as DP ports instead. Fix it up by using the reported child device size from the VBT instead of assuming it matches out struct defintions. The latest spec I have shows that the child device size should be 36 bytes for rev >= 195, however on my BSW the size is actually 37 bytes. And our current struct definition is 33 bytes. Feels like the entire VBT parses would need to be rewritten to handle changes in the layout better, but for now I've decided to do just the bare minimum to get my eDP port back. Cc: Vijay Purushothaman <vijay.a.purushothaman@linux.intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:56:14 +02:00
Michel Thierry	a4e0bedca6	drm/i915: Use complete address space in true PPGTT True PPGTT is capable of having a full address space, even if the system has less allocated memory. Note that aliasing PPGTT always aliases the GGTT and thus should remain of the same size. Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:56:13 +02:00
Michel Thierry	d7b2633dba	drm/i915/gen8: Dynamic page table allocations This finishes off the dynamic page tables allocations, in the legacy 3 level style that already exists. Most everything has already been setup to this point, the patch finishes off the enabling by setting the appropriate function pointers. In LRC mode, contexts need to know the PDPs when they are populated. With dynamic page table allocations, these PDPs may not exist yet. Check if PDPs have been allocated and use the scratch page if they do not exist yet. Before submission, update the PDPs in the logic ring context as PDPs have been allocated. v2: Update aliasing/true ppgtt allocate/teardown/clear functions for gen 6 & 7. v3: Rebase. v4: Remove BUG() from ppgtt_unbind_vma, but keep checking that either teardown_va_range or clear_range functions exist (Daniel). v5: Similar to gen6, in init, gen8_ppgtt_clear_range call is only needed for aliasing ppgtt. Zombie tracking was originally added for teardown function and is no longer required. v6: Update err_out case in gen8_alloc_va_range (missed from lastest rebase). v7: Rebase after s/page_tables/page_table/. v8: Updated scratch_pt check after scratch flag was removed in previous patch. v9: Note that lrc mode needs to be updated to support init state without any PDP. v10: Unmap correct page_table in gen8_alloc_va_range's error case, clean-up gen8_aliasing_ppgtt_init (remove duplicated map), and initialize PTs during page table allocation. v11: Squashed LRC enabling commit, otherwise LRC mode would be left broken until it was updated to handle the init case without any PDP. v12: Do not overallocate new_pts bitmap, make alloc_gen8_temp_bitmaps static and don't abuse of inline functions. (Mika) Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:56:13 +02:00
Michel Thierry	33c8819f1b	drm/i915/gen8: begin bitmap tracking Like with gen6/7, we can enable bitmap tracking with all the preallocations to make sure things actually don't blow up. v2: Rebased to match changes from previous patches. v3: Without teardown logic, rely on used_pdpes and used_pdes when freeing page tables. v4: Rebased after s/page_tables/page_table/. v5: Rebased after page table generalizations. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:56:12 +02:00
Michel Thierry	e5815a2e05	drm/i915/gen8: Split out mappings When we do dynamic page table allocations for gen8, we'll need to have more control over how and when we map page tables, similar to gen6. In particular, DMA mappings for page directories/tables occur at allocation time. This patch adds the functionality and calls it at init, which should have no functional change. The PDPEs are still a special case for now. We'll need a function for that in the future as well. v2: Handle renamed unmap_and_free_page functions. v3: Updated after teardown_va logic was removed. v4: Rebase after s/page_tables/page_table/. v5: No longer allocate all PDPs in GEN8+ systems with less than 4GB of memory, and update populate_lr_context to handle this new case (proper tracking will be added later in the patch series). v6: Assign lrc page directory pointer addresses using a macro. (Mika) Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:56:12 +02:00
Michel Thierry	c488dbbaa7	drm/i915: Extract PPGTT param from page_directory alloc This will be useful for when we move to 48b addressing, and the PDP isn't the root of the page table structure. v2: Rebase after changes for Gen8+ systems with less than 4GB of memory. v3: Rebase after Mika's code review. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2) Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:56:11 +02:00
Michel Thierry	09942c656b	drm/i915: num_pd_pages/num_pd_entries isn't useful These values are never quite useful for dynamic allocations of the page tables. Getting rid of them will help prevent later confusion. v2: Updated to use unmap_and_free_pd functions. v3: Updated gen8_ppgtt_free after teardown logic was removed. v4: Rebase after s/page_tables/page_table/. v5: Keep allocating all page directories in GEN8+ systems with less than 4GB of memory. Updated gen6_for_all_pdes. v6: Prevent (harmless) out of range access in gen6_for_all_pdes. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:56:11 +02:00
Michel Thierry	7cb6d7ac63	drm/i915/gen8: Update pdp switch and point unused PDPs to scratch page One important part of this patch is we now write a scratch page directory into any unused PDP descriptors. This matters for 2 reasons, first, we're not allowed to just use 0, or an invalid pointer, and second, we must wipe out any previous contents from the last context. The latter point only matters with full PPGTT. The former point only effect platforms with less than 4GB memory. v2: Updated commit message to point that we must set unused PDPs to the scratch page. v3: Unmap scratch_pd in gen8_ppgtt_free. v4: Initialize scratch_pd. (Mika) Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:56:10 +02:00
Michel Thierry	5441f0cbe1	drm/i915/gen8: pagetable allocation rework Start using gen8_for_each_pde macro to allocate page tables. v2: teardown_va_range references removed. v3: Rebase after s/page_tables/page_table/. v4: Keep setting up page tables for all page directories in systems with less than 4GB of memory. v5: Also initialize the page tables. (Mika) v6: Initialize all page tables, including the extra ones from systems with less than 4GB of memory. (Mika) Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:56:10 +02:00
Michel Thierry	69876bed7e	drm/i915/gen8: page directories rework allocation Start using gen8_for_each_pdpe macro to allocate the page directories. Similar to PTs, while setting up a page directory, make all entries of the pd point to the scratch pd before mapping (and make all its entries point to the scratch page); this is to be safe in case of out of bound access or proactive prefetch. Systems without LLC require an explicit flush. v2: Rebased after s/free_pt_*/unmap_and_free_pt/ change. v3: Rebased after teardown va range logic was removed. v4: Keep setting up all page directories for systems with less than 4GB of memory. v5: Initialize PDs. (Mika) v6: Initialize also the extra PDs from systems with less than 4GB of memory. (Mika) Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:56:09 +02:00
Michel Thierry	9271d959dc	drm/i915/gen8: Add dynamic allocation macros and helper functions Similar to gen6, we will use for_each_pde/for_each_pdpe and pte/pde/pdpe_index to iterate over these new structures. v2: Match trace_i915_va_teardown params v3: Multiple rebases. v4: Updated to use unmap_and_free_pt. v5: teardown_va_range logic no longer needed. v6: Rebase after s/page_tables/page_table/. v7: Renamed commit to match what it does now (it was "Use dynamic allocation idioms on free"). v8: Prevent (harmless) out of range access in gen8_for_each_pde and gen8_for_each_pdpe_e. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> [danvet: s/BUG/WARN/] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:56:09 +02:00
Michel Thierry	5a8e994352	drm/i915/gen8: Initialize page tables Similar to gen6, while setting up a page table, make all entries of the pt point to the scratch page before mapping; this is to be safe in case of out of bound access or proactive prefetch. Systems without LLC require an explicit flush. v2: Expanded commit text and fixed indentation (Mika) Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:56:08 +02:00
Michel Thierry	9c57f07001	drm/i915: Remove unnecessary gen8_ppgtt_unmap_pages We are already unmapping them in gen8_ppgtt_free. This function became redundant since commit `06fda602db` ("drm/i915: Create page table allocators"). Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:56:08 +02:00
Michel Thierry	ec565b3c15	drm/i915: Remove _entry from PPGTT page structures Lets try to keep this consistent: Page Directory Pointer (PDP). Page Directory (PD), also known as page directory pointer entries. Page Table (PT), also known as page directory entries. s/struct i915_page_table_entry/struct i915_page_table/ s/struct i915_page_directory_entry/struct i915_page_directory/ s/struct i915_page_directory_pointer_entry/struct i915_page_directory_pointer/ Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:56:07 +02:00
Chris Wilson	94f8cf109e	drm/i915: Record ring->start address in error state This is mostly useful for execlists where the rings switch between contexts (and so checking that the ring's start register matches the context is important). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:56:07 +02:00
Chris Wilson	b0da1b79aa	drm/i915: Suppress empty lines from debugfs/i915_gem_objects This is just so that I don't have to read about the batch pool on systems that are not using it! Rather than using a newline between the kernel clients and userspace clients, just distinguish the internal allocations with a '[k]' Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:56:06 +02:00
Chris Wilson	481a3d43b9	drm/i915: Include active flag when describing objects in debugfs Since we use obj->active as a hint in many places throughout the code, knowing its state in debugfs is extremely useful. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:56:06 +02:00
Chris Wilson	8d9d5744c6	drm/i915: Split batch pool into size buckets Now with the trimmed memcpy before the command parser, we try to allocate many different sizes of batches, predominantly one or two pages. We can therefore speed up searching for a good sized batch by keeping the objects of buckets of roughly the same size. v2: Add a comment about bucket sizes Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:56:05 +02:00
Chris Wilson	35c94185c5	drm/i915: Free batch pool when idle At runtime, this helps ensure that the batch pools are kept trim and fast. Then at suspend, this releases memory that we do not need to restore. It also ties into the oom-notifier to ensure that we recover as much kernel memory as possible during OOM. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:56:05 +02:00
Chris Wilson	06fbca713e	drm/i915: Split the batch pool by engine I woke up one morning and found 50k objects sitting in the batch pool and every search seemed to iterate the entire list... Painting the screen in oils would provide a more fluid display. One issue with the current design is that we only check for retirements on the current ring when preparing to submit a new batch. This means that we can have thousands of "active" batches on another ring that we have to walk over. The simplest way to avoid that is to split the pools per ring and then our LRU execution ordering will also ensure that the inactive buffers remain at the front. v2: execlists still requires duplicate code. v3: execlists requires more duplicate code Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:56:04 +02:00
Chris Wilson	de4e783a3f	drm/i915: Tidy batch pool logic Move the madvise logic out of the execbuffer main path into the relatively rare allocation path, making the execbuffer manipulation less fragile. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:56:04 +02:00
Chris Wilson	ed9ddd25b2	drm/i915: Split i915_gem_batch_pool into its own header In the next patch, I want to use the structure elsewhere and so require it defined earlier. Rather than move the definition to an earlier location where it feels very odd, place it in its own header file. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:56:03 +02:00
Chris Wilson	7c27f52509	drm/i915: Re-enable RPS wait-boosting for all engines This reverts commit `ec5cc0f9b0` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Thu Jun 12 10:28:55 2014 +0100 drm/i915: Restrict GPU boost to the RCS engine The premise that media/blitter workloads are not affected by boosting is patently false with a trip through igt. The question that remains is what exactly is going wrong with the media workload that prompted this? Hopefully that would be fixed by the missing agressive downclocking, in addition to the extra restrictions imposed on how frequent a process is allowed to boost. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Deepak S <deepak.s@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll> Acked-by: Deepak S <deepak.s@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:56:03 +02:00
Chris Wilson	1854d5ca0d	drm/i915: Deminish contribution of wait-boosting from clients With boosting for missed pageflips, we have a much stronger indication of when we need to (temporarily) boost GPU frequency to ensure smooth delivery of frames. So now only allow each client to perform one RPS boost in each period of GPU activity due to stalling on results. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Deepak S <deepak.s@linux.intel.com> Reviewed-by: Deepak S <deepak.s@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:56:02 +02:00
Chris Wilson	6ad790c0f5	drm/i915: Boost GPU frequency if we detect outstanding pageflips If we hit a vblank and see that have a pageflip queue but not yet processed, ensure that the GPU is running at maximum in order to clear the backlog. Pageflips are only queued for the following vblank, if we miss it, there will be a visible stutter. Boosting the GPU frequency doesn't prevent us from missing the target vblank, but it should help the subsequent frames hitting theirs. v2: Reorder vblank vs flip-complete so that we only check for a missed flip after processing the completion events, and avoid spurious boosts. v3: Rename missed_vblank v4: Rebase v5: Cancel the outstanding work in runtime suspend v6: Rebase v7: Rebase required fixing Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Deepak S<deepak.s@linux.intel.com> Reviewed-by: Deepak S<deepak.s@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:56:02 +02:00
Chris Wilson	edcf284bfe	drm/i915: Fix computation of last_adjustment for RPS autotuning The issue is that by computing the last_adj value after applying the clamping, we can end up with a bogus value for feeding into the next RPS autotuning step. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Deepak S <deepak.s@linux.intel.com> Reviewed-by: Deepak S <deepak.s@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:56:01 +02:00
Chris Wilson	8fb55197e6	drm/i915: Agressive downclocking on Baytrail Reuse the same reclocking strategy for Baytail as on its bigger brethren, Sandybridge and Ivybridge. In particular, this makes the device quicker to reclock (both up and down) though the tendency now is to downclock more aggressively to compensate for the RPS boosts. v2: Rebase v3: Exclude Cherrytrail as Deepak was concerned that the increased number of register writes would wake the common powerwell too often. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Deepak S <deepak.s@linux.intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Deepak S <deepak.s@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:56:01 +02:00
Chris Wilson	cf5d8a46a0	drm/i915: Fix the flip synchronisation to consider mmioflips Currently we emit semaphore synchronisation as if we were going to flip using the target CS engine, but we then change our minds and do the flip using the CPU. Consequently we write instructions to the ring but never use them - even to the point of filling that ring up entirely and never submitting a request. The wrinkle in the ointment is that we have to tell a white lie to pin-to-display for it to skip the synchronisation for mmioflips as we will create a task specifically for that slow synchronisation. An oddity of note is the discrepancy in requests that we tell to pin-display to serialise to and that we then eventually wait upon. This is due to a limitation in the i915_gem_object_sync() routine that will be lifted later. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:56:00 +02:00
Chris Wilson	ee286370d6	drm/i915: Cache last obj->pages location for i915_gem_object_get_page() The biggest user of i915_gem_object_get_page() is the relocation processing during execbuffer. Typically userspace passes in a set of relocations in sorted order. Sadly, we alternate between relocations increasing from the start of the buffers, and relocations decreasing from the end. However the majority of consecutive lookups will still be in the same page. We could cache the start of the last sg chain, however for most callers, the entire sgl is inside a single chain and so we see no improve from the extra layer of caching. v2: Avoid the double increment inside unlikely() References: https://bugs.freedesktop.org/show_bug.cgi?id=88308 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:56:00 +02:00
Damien Lespiau	f9fc42f4bd	drm/i915/skl: Implement WaDisableVFUnitClockGating Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:55:59 +02:00
Damien Lespiau	669506e781	drm/i915/skl: Fix stepping check for a couple of W/As Both WaDisableSDEUnitClockGating and WaSetGAPSunitClckGateDisable are needed on B0 as well. Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:55:59 +02:00
Arun Siluvery	51847fb99f	drm/i915: Do not set L3-LLC Coherency bit in ctx descriptor According to Spec this is a reserved bit for Gen9+ and should not be set. Change-Id: I0215fb7057b94139b7a2f90ecc7a0201c0c93ad4 Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:55:58 +02:00
Maarten Lankhorst	b833bb61fd	drm/i915: use kref_put_mutex in i915_gem_request_unreference__unlocked Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:55:58 +02:00
Ander Conselvan de Oliveira	9b4fd8f250	drm/i915: Don't use staged config in intel_mst_pre_enable_dp() For the conversion to atomic. The pre_enable() hooks are called as part of the crtc enable sequence, at which point the staged config was already made effective. Furthermore, the function actually changes hardware state, so it should anyway deal with current and not staged config. Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:55:57 +02:00
Ander Conselvan de Oliveira	98a221da4a	drm/i915: Don't use staged config in check_encoder_cloning() Reduce dependency on the staged config by using the atomic state instead. Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-04-10 08:55:57 +02:00

1 2 3 4 5 ...

8461 Commits