linux

Author	SHA1	Message	Date
Daniel Vetter	308887aad1	drm/i915: fix reset handling in the throttle ioctl While auditing the code I've noticed one place (the throttle ioctl) which does not yet wait for the reset handler to complete and doesn't properly decode the wedge state into -EAGAIN/-EIO. Fix this up by calling the right helpers. This might explain the oddball "my compositor just died in a successfull gpu reset" reports. Or maybe not, since current mesa doesn't use this ioctl to throttle command submission. The throttle ioctl doesn't take the struct_mutex, so to avoid busy-looping with -EAGAIN while a reset is in process, check for errors first and wait for the handler to complete if a reset is pending by calling i915_gem_wait_for_error. Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-20 13:11:15 +01:00
Daniel Vetter	33196dedda	drm/i915: move wedged to the other gpu error handling stuff And to make Ben Widawsky happier, use the gpu_error instead of the entire device as the argument in some functions. Drop the outdated comment on ->wedged for now, a follow-up patch will change the semantics and add a proper comment again. Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-20 13:11:15 +01:00
Daniel Vetter	99584db33b	drm/i915: extract hangcheck/reset/error_state state into substruct This has been sprinkled all over the place in dev_priv. I think it'd be good to also move all the code into a separate file like i915_gem_error.c, but that's for another patch. Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-20 13:11:14 +01:00
Ben Widawsky	93d187993b	drm/i915: Remove use of gtt_mappable_entries Mappable_end, ie. size is almost always what you want as opposed to the number of entries. Since we already have that information, we can scrap the number of entries and only calculate it when needed. If gtt_start is !0, this will have slightly different behavior. This difference can only occur in DRI1, and exists when we try to kick out the firmware fb. The new code seems like a bugfix to me. The other case where we've changed the behavior is during init we check the mappable region against our current known upper and lower limits (64MB, and 512MB). This now matches the comment, and makes things more convenient after removing gtt_mappable_entries. Also worth noting is the setting of mappable_end is taken out of setup because we do it earlier now in the DRI2 case and therefore need to add that tiny hunk to support the DRI1 IOCTL. v2: Move up mappable end to before legacy AGP init v3: Add the dev_priv inclusion here from previous rebase error in patch 5 Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> (v2) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: squash in fix for a printk format flag mismatch warning.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-20 13:09:20 +01:00
Ben Widawsky	5d4545aef5	drm/i915: Create a gtt structure The purpose of the gtt structure is to help isolate our gtt specific properties from the rest of the code (in doing so it help us finish the isolation from the AGP connection). The following members are pulled out (and renamed): gtt_start gtt_total gtt_mappable_end gtt_mappable gtt_base_addr gsm The gtt structure will serve as a nice place to put gen specific gtt routines in upcoming patches. As far as what else I feel belongs in this structure: it is meant to encapsulate the GTT's physical properties. This is why I've not added fields which track various drm_mm properties, or things like gtt_mtrr (which is itself a pretty transient field). Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> [Ben modified commit messages] Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-17 22:33:56 +01:00
Chris Wilson	43e28f092b	drm/i915: Bail if we attempt to allocate pages for a purged object Move the existing checking inside bind_to_gtt() to the more appropriate layer in order to prevent recreation of the pages after they have been explicitly truncated. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-17 22:07:59 +01:00
Chris Wilson	dd624afd53	drm/i915: Add a debug interface to forcibly evict and shrink our object caches As a means to investigate some bad system behaviour related to the purging of the active, inactive and unbound lists, it is useful to be able to manually control when those lists should be cleared. v2: use _safe list iterators as we kick objects from the list as we walk. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Add a small comment explaining why we don't need to check and wait for gpu resets, acked by Chris on irc.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-17 22:07:57 +01:00
Imre Deak	0fa8779651	drm/i915: use gtt_get_size() instead of open coding it Signed-off-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-17 22:07:56 +01:00
Imre Deak	56c844e539	drm/i915: merge {i965, sandybridge}_write_fence_reg() The two functions are rather similar, so merge them. Signed-off-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-17 22:07:55 +01:00
Imre Deak	d865110cc2	drm/i915: merge get_gtt_alignment/get_unfenced_gtt_alignment() The two functions are rather similar, so merge them. Signed-off-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-17 22:07:54 +01:00
Dave Airlie	b5cc6c0387	Merge tag 'drm-intel-next-2012-12-21' of git://people.freedesktop.org/~danvet/drm-intel into drm-next Daniel writes: - seqno wrap fixes and debug infrastructure from Mika Kuoppala and Chris Wilson - some leftover kill-agp on gen6+ patches from Ben - hotplug improvements from Damien - clear fb when allocated from stolen, avoids dirt on the fbcon (Chris) - Stolen mem support from Chris Wilson, one of the many steps to get to real fastboot support. - Some DDI code cleanups from Paulo. - Some refactorings around lvds and dp code. - some random little bits&pieces * tag 'drm-intel-next-2012-12-21' of git://people.freedesktop.org/~danvet/drm-intel: (93 commits) drm/i915: Return the real error code from intel_set_mode() drm/i915: Make GSM void drm/i915: Move GSM mapping into dev_priv drm/i915: Move even more gtt code to i915_gem_gtt drm/i915: Make next_seqno debugs entry to use i915_gem_set_seqno drm/i915: Introduce i915_gem_set_seqno() drm/i915: Always clear semaphore mboxes on seqno wrap drm/i915: Initialize hardware semaphore state on ring init drm/i915: Introduce ring set_seqno drm/i915: Missed conversion to gtt_pte_t drm/i915: Bug on unsupported swizzled platforms drm/i915: BUG() if fences are used on unsupported platform drm/i915: fixup overlay stolen memory leak drm/i915: clean up PIPECONF bpc #defines drm/i915: add intel_dp_set_signal_levels drm/i915: remove leftover display.update_wm assignment drm/i915: check for the PCH when setting pch_transcoder drm/i915: Clear the stolen fb before enabling drm/i915: Access to snooped system memory through the GTT is incoherent drm/i915: Remove stale comment about intel_dp_detect() ... Conflicts: drivers/gpu/drm/i915/intel_display.c	2013-01-17 20:34:08 +10:00
Daniel Vetter	93927ca52a	drm/i915: Revert shrinker changes from "Track unbound pages" This partially reverts commit `6c085a728c` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Mon Aug 20 11:40:46 2012 +0200 drm/i915: Track unbound pages Closer inspection of that patch revealed a bunch of unrelated changes in the shrinker: - The shrinker count is now in pages instead of objects. - For counting the shrinkable objects the old code only looked at the inactive list, the new code looks at all bounds objects (including pinned ones). That is obviously in addition to the new unbound list. - The shrinker cound is no longer scaled with sysctl_vfs_cache_pressure. Note though that with the default tuning value of vfs_cache_pressue = 100 this doesn't affect the shrinker behaviour. - When actually shrinking objects, the old code first dropped purgeable objects, then normal (inactive) objects. Only then did it, in a last-ditch effort idle the gpu and evict everything. The new code omits the intermediate step of evicting normal inactive objects. Safe for the first change, which seems benign, and the shrinker count scaling, which is a bit a different story, the endresult of all these changes is that the shrinker is _much_ more likely to fall back to the last-ditch resort of idling the gpu and evicting everything. The old code could only do that if something else evicted lots of objects meanwhile (since without any other changes the nr_to_scan will be smaller than the object count). Reverting the vfs_cache_pressure behaviour itself is a bit bogus: Only dentry/inode object caches should scale their shrinker counts with vfs_cache_pressure. Originally I've had that change reverted, too. But Chris Wilson insisted that it's too bogus and shouldn't again see the light of day. Hence revert all these other changes and restore the old shrinker behaviour, with the minor adjustment that we now first scan the unbound list, then the inactive list for each object category (purgeable or normal). A similar patch has been tested by a few people affected by the gen4/5 hangs which started to appear in 3.7, which some people bisected to the "drm/i915: Track unbound pages" commit. But just disabling the unbound logic alone didn't change things at all. Note that this patch doesn't fix the referenced bugs, it only hides the underlying bug(s) well enough to restore pre-3.7 behaviour. The key to achieve that is to massively reduce the likelyhood of going into a full gpu stall and evicting everything. v2: Reword commit message a bit, taking Chris Wilson's comment into account. v3: On Chris Wilson's insistency, do not reinstate the rather bogus vfs_cache_pressure change. Tested-by: Greg KH <gregkh@linuxfoundation.org> Tested-by: Dave Kleikamp <dave.kleikamp@oracle.com> References: https://bugs.freedesktop.org/show_bug.cgi?id=55984 References: https://bugs.freedesktop.org/show_bug.cgi?id=57122 References: https://bugs.freedesktop.org/show_bug.cgi?id=56916 References: https://bugs.freedesktop.org/show_bug.cgi?id=57136 Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-10 18:02:44 +01:00
Chris Wilson	93be8788e6	drm/i915; Only increment the user-pin-count after successfully pinning the bo As along the error path we do not correct the user pin-count for the failure, we may end up with userspace believing that it has a pinned object at offset 0 (when interrupted by a signal for example). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2013-01-07 10:30:53 +01:00
Dave Airlie	8be0e5c427	Merge branch 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel into drm-next Some fixes for 3.8: - Watermark fixups from Chris Wilson (4 pieces). - 2 snb workarounds, seem to be recently added to our internal DB. - workaround for the infamous i830/i845 hang, seems now finally solid! Based on Chris' fix for SNA, now also for UXA/mesa&old SNA. - Some more fixlets for shrinker-pulls-the-rug issues (Chris&me). - Fix dma-buf flags when exporting (you). - Disable the VGA plane if it's enabled on lid open - similar fix in spirit to the one I've sent you last weeek, BIOS' really like to mess with the display when closing the lid (awesome debug work from Krzysztof Mazur). * 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel: drm/i915: disable shrinker lock stealing for create_mmap_offset drm/i915: optionally disable shrinker lock stealing drm/i915: fix flags in dma buf exporting i915: ensure that VGA plane is disabled drm/i915: Preallocate the drm_mm_node prior to manipulating the GTT drm_mm manager drm: Export routines for inserting preallocated nodes into the mm manager drm/i915: don't disable disconnected outputs drm/i915: Implement workaround for broken CS tlb on i830/845 drm/i915: Implement WaSetupGtModeTdRowDispatch drm/i915: Implement WaDisableHiZPlanesWhenMSAAEnabled drm/i915: Prefer CRTC 'active' rather than 'enabled' during WM computations drm/i915: Clear self-refresh watermarks when disabled drm/i915: Double the cursor self-refresh latency on Valleyview drm/i915: Fixup cursor latency used for IVB lp3 watermarks	2012-12-30 13:54:12 +10:00
Ben Widawsky	d7e5008f7c	drm/i915: Move even more gtt code to i915_gem_gtt This really should have been part of the kill agp series. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-20 16:27:35 +01:00
Daniel Vetter	da494d7ca5	drm/i915: disable shrinker lock stealing for create_mmap_offset The mmap offset structure is not part of the drm/i915 code, but provided by gem helpers. To avoid leaky abstractions (by either depending upon implementation details of said helper wrt to preallocations, or reimplementing it in our code and so fuzzing around in internal details of that helpr) simply disable the shrinker lock stealing accross calls into the helper functions. This should fix igt/gem_tiled_swapping. v2: Fix cleanup path confusion bemoaned by Chris Wilson. Reported-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-20 14:57:35 +01:00
Daniel Vetter	677feac291	drm/i915: optionally disable shrinker lock stealing commit `5774506f15` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Wed Nov 21 13:04:04 2012 +0000 drm/i915: Borrow our struct_mutex for the direct reclaim added a nice trick to steal the struct_mutex lock in the shrinker if it's the current task holding it. But this also caused the requirement that every place which allocates memory needs to be careful about the gem state of objects, since the shrinker could have pulled the rug out from under it. We've usually solved this by carefully preallocating things or ensure that buffers are pinned already. But the shrinker also reaps mmap offset, so allocating those needs to be careful, too. Now that code has been factored out into some common helpers, so either we have fragile code depending upon the common helper not doing something we don't want it to do. Or we need to reimplement the mmap offset creation and so also leak implementation details into our code. Since this all results in leaky abstraction, cop out by disabling the lock borrowing trick while calling down into the helpers. That way our craziness is nicely confined to files in drm/i915. v2: Split out the change to create_mmap_offset as request by Chris Wilson. Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-20 14:56:04 +01:00
Mika Kuoppala	fca26bb453	drm/i915: Introduce i915_gem_set_seqno() This function can be used to set the driver's next_seqno to arbitrary value. i915_gem_set_seqno() will idle the gpu, retire outstanding requests, clear the semaphore mailboxes and set the hardware status page's seqno index. Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-19 11:25:10 +01:00
Mika Kuoppala	ba1a7067c0	drm/i915: Always clear semaphore mboxes on seqno wrap In preparation for setting the seqno to arbitrary value on init or through debugfs. We need to always clear the semaphores and set the hws page seqno index by calling intel_ring_init_seqno(). v2: rewrote the commit message as suggested by Chris Wilson. Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-19 11:17:41 +01:00
Mika Kuoppala	f7e98ad4d4	drm/i915: Initialize hardware semaphore state on ring init Hardware status page needs to have proper seqno set as our initial seqno can be arbitrary. If initial seqno is close to wrap boundary on init and i915_seqno_passed() (31bit space) refers to hw status page which contains zero, errorneous result will be returned. v2: clear mboxes and set hws page directly instead of going through rings. Suggested by Chris Wilson. v3: hws needs to be updated for all gens. Noticed by Chris Wilson. References: https://bugs.freedesktop.org/show_bug.cgi?id=58230 Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-19 11:17:01 +01:00
Ben Widawsky	8782e26c0c	drm/i915: Bug on unsupported swizzled platforms Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-18 22:31:23 +01:00
Ben Widawsky	7dbf9d6e0f	drm/i915: BUG() if fences are used on unsupported platform Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-18 22:29:55 +01:00
Chris Wilson	dc9dd7a20f	drm/i915: Preallocate the drm_mm_node prior to manipulating the GTT drm_mm manager As we may reap neighbouring objects in order to free up pages for allocations, we need to be careful not to allocate in the middle of the drm_mm manager. To accomplish this, we can simply allocate the drm_mm_node up front and then use the combined search & insert drm_mm routines, reducing our code footprint in the process. Fixes (partially) i-g-t/gem_tiled_swapping Reported-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Jani Nikula <jani.nikula@intel.com> [danvet: Again fixup atomic bikeshed.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-18 22:02:29 +01:00
Linus Torvalds	3c2e81ef34	Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux Pull DRM updates from Dave Airlie: "This is the one and only next pull for 3.8, we had a regression we found last week, so I was waiting for that to resolve itself, and I ended up with some Intel fixes on top as well. Highlights: - new driver: nvidia tegra 20/30/hdmi support - radeon: add support for previously unused DMA engines, more HDMI regs, eviction speeds ups and fixes - i915: HSW support enable, agp removal on GEN6, seqno wrapping - exynos: IPP subsystem support (image post proc), HDMI - nouveau: display class reworking, nv20->40 z compression - ttm: start of locking fixes, rcu usage for lookups, - core: documentation updates, docbook integration, monotonic clock usage, move from connector to object properties" * 'drm-next' of git://people.freedesktop.org/~airlied/linux: (590 commits) drm/exynos: add gsc ipp driver drm/exynos: add rotator ipp driver drm/exynos: add fimc ipp driver drm/exynos: add iommu support for ipp drm/exynos: add ipp subsystem drm/exynos: support device tree for fimd radeon: fix regression with eviction since evict caching changes drm/radeon: add more pedantic checks in the CP DMA checker drm/radeon: bump version for CS ioctl support for async DMA drm/radeon: enable the async DMA rings in the CS ioctl drm/radeon: add VM CS parser support for async DMA on cayman/TN/SI drm/radeon/kms: add evergreen/cayman CS parser for async DMA (v2) drm/radeon/kms: add 6xx/7xx CS parser for async DMA (v2) drm/radeon: fix htile buffer size computation for command stream checker drm/radeon: fix fence locking in the pageflip callback drm/radeon: make indirect register access concurrency-safe drm/radeon: add W\|RREG32_IDX for MM_INDEX\|DATA based mmio accesss drm/exynos: support extended screen coordinate of fimd drm/exynos: fix x, y coordinates for right bottom pixel drm/exynos: fix fb offset calculation for plane ...	2012-12-17 08:26:17 -08:00
Chris Wilson	eb119bd612	drm/i915: Access to snooped system memory through the GTT is incoherent We ignore all the user requests to handle flushing to the GTT domain if the user requests such on a snoopable bo, and as such access through the GTT to such pages remains incoherent. The specs even warn that such behaviour is undefined - a strong reason never to do so. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-17 12:28:23 +01:00
Mika Kuoppala	9e8e36879f	drm/i915: Set initial seqno value close to wrap boundary To gain confidence in the wrap handling, make it happen quite soon after the boot. Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-11 14:07:22 +01:00
Chris Wilson	107f27a5df	drm/i915: Open-code i915_gpu_idle() for handling seqno wrapping The complication is that during seqno wrapping we must be extremely careful not to write to any ring as that will require a new seqno, and so would recurse back into the seqno wrap handler. So we cannot call i915_gpu_idle() as that does additional work beyond simply retiring the current set of requests, and instead must do the minimal work ourselves during seqno wrapping. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-11 14:07:03 +01:00
Mika Kuoppala	f72b3435c1	drm/i915: Don't emit semaphore wait if wrap happened If wrap just happened we need to prevent emitting waits for pre wrap values. Detect this and emit no-ops instead. v2: Use olr > seqno to detect wrap instead of *seqno == 0 as suggested by Chris Wilson. v3: Use last used seqno to detect the wraparound. From Chris Wilson v4: Fixed unnecessary last_seqno assigment References: https://bugs.freedesktop.org/show_bug.cgi?id=57967 Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-11 13:32:26 +01:00
Linus Torvalds	caf491916b	Revert "revert "Revert "mm: remove __GFP_NO_KSWAPD""" and associated damage This reverts commits `a50915394f` and `d7c3b937bd`. This is a revert of a revert of a revert. In addition, it reverts the even older i915 change to stop using the __GFP_NO_KSWAPD flag due to the original commits in linux-next. It turns out that the original patch really was bogus, and that the original revert was the correct thing to do after all. We thought we had fixed the problem, and then reverted the revert, but the problem really is fundamental: waking up kswapd simply isn't the right thing to do, and direct reclaim sometimes simply _is_ the right thing to do. When certain allocations fail, we simply should try some direct reclaim, and if that fails, fail the allocation. That's the right thing to do for THP allocations, which can easily fail, and the GPU allocations want to do that too. So starting kswapd is sometimes simply wrong, and removing the flag that said "don't start kswapd" was a mistake. Let's hope we never revisit this mistake again - and certainly not this many times ;) Acked-by: Mel Gorman <mgorman@suse.de> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Rik van Riel <riel@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-12-10 11:03:05 -08:00
Chris Wilson	e9b73c6739	drm/i915: Reduce memory pressure during shrinker by preallocating swizzle pages On a machine with bit17 swizzling, we need to store the bit17 of the physical page address in put-pages. This requires a memory allocation, on average less than a page, which may be difficult to satisfy is the request to put-pages is on behalf of the shrinker. We could allow that allocation to pull from the reserved memory pools, but it seems much safer to preallocate the array for tiled objects on affected machines. v2: Export i915_gem_object_needs_bit17_swizzle() for reuse. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-07 01:16:15 +01:00
Mika Kuoppala	498d2ac15c	drm/i915: Add intel_ring_handle_seqno wrap If there are pre-wrap values in semaphore-mbox registers after wrap, syncing against some after-wrap request will complete immediately. Fix this by emitting ring commands to set mbox registers to zero when the wrap happens. v2: Use __intel_ring_begin to emit ring commands, from Chris Wilson. Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Add a small comment to handle_seqno_wrap.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-06 13:14:34 +01:00
Daniel Vetter	1a240d4de2	drm/i915: fixup sparse warnings - __iomem where there is none (I love how we mix these things up). - Use gfp_t instead of an other plain type. - Unconfuse one place about enum pipe vs enum transcoder - for the pch transcoder we actually use the pipe enum. Fixup the other cases where we assign the pipe to the cpu transcoder with explicit casts. - Declare the mch_lock properly in a header. There is still a decent mess in intel_bios.c about __iomem, but heck, this is x86 and we're allowed to do that. Makes-sparse-happy: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Use a space after the cast consistently and fix up the newly-added cast in i915_irq.c to properly use __iomem.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-03 22:31:04 +01:00
Damien Lespiau	4239ca779d	drm/i915: Fix dieing -> dying typo Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-03 18:25:02 +01:00
Chris Wilson	a2165e3123	drm/i915: Decouple the object from the unbound list before freeing pages As we may actually allocate in order to save the physical swizzling bits during the free, we have to be careful not to trigger the shrinker on the same object. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Added a small comment in the code to really drive the scariness of this patch home.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-12-03 17:22:16 +01:00
Chris Wilson	42dcedd4f2	drm/i915: Use a slab for object allocation The primary purpose of this was to debug some use-after-free memory corruption that was causing an OOPS inside drm/i915. As it turned out the corruption was being caused elsewhere and i915.ko as a major user of many objects was being hit hardest. Indeed as we do frequent the generic kmalloc caches, dedicating one to ourselves (or at least naming one for us depending upon the core) aids debugging our own slab usage. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-30 23:44:05 +01:00
Chris Wilson	0104fdbb84	drm/i915: Introduce i915_gem_object_create_stolen() Allow for the creation of GEM objects backed by stolen memory. As these are not backed by ordinary pages, we create a fake dma mapping and store the address in the scatterlist rather than obj->pages. v2: Mark _i915_gem_object_create_stolen() as static, as noticed by Jesse Barnes. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-30 23:34:16 +01:00
Daniel Vetter	8dcf015eb9	drm/i915: optimize the shmem_pwrite slowpath handling Since we drop dev->struct_mutex when going through the slowpath, the object might have been moved out of the cpu domain. Hence we need to clflush the entire object to ensure that after the ioctl returns, everything is coherent again (interwoven writes are ill-defined anyway). But we only need to do this if we start in the cpu domain and the object requires flushing for coherency. So don't do the flushing if the object is coherent anyway or if we've done in-line clfushing already. v2: i915_gem_clflush_object already checks whether the object is coherent and if so, drops the flushing. Hence we don't need to check that ourselves, simplifying the condition. v3: Reorder the checks for better clarity (and adjust the comment accordingly), suggested by Chris Wilson. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-29 13:49:08 +01:00
Daniel Vetter	a39a68054f	drm/i915: simplify shmem pwrite/pread slowpath handling The shmem paths for pwrite/pread used a clever trick to hold onto a single page when dropping the big dev->struct_mutex for the slowpath. But this ran the risk of reinstating (or not completely purging) the backing storage when dropping purgeable objects. Hence the code needed to keep track of whether it ever dropped the lock, and if it did, manually check whether it needs to re-purge the backing storage. But thanks to the pages pin count introduced in commit `a5570178c0` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Tue Sep 4 21:02:54 2012 +0100 drm/i915: Pin backing pages whilst exporting through a dmabuf vmap which allowed us to pin the backing storage and remove that page reference trick from shmem_pwrite/read in commit `f60d7f0c1d` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Tue Sep 4 21:02:56 2012 +0100 drm/i915: Pin backing pages for pread and commit `755d22184f` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Tue Sep 4 21:02:55 2012 +0100 drm/i915: Pin backing pages for pwrite we can now abolish this check. The slowpath cleanup completely disappears from pread, and for pwrite we're only left with the domain fixup in case someone moved the object out of the cpu domain from under us. A follow-on patch will optimize that a notch more. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-29 13:48:34 +01:00
Mika Kuoppala	7b01e260a6	drm/i915: Set sync_seqno properly after seqno wrap i915_gem_handle_seqno_wrap() will zero all sync_seqnos but as the wrap can happen inside ring->sync_to(), pre wrap seqno was carried over and overwrote the zeroed sync_seqno. When wrap is handled, all outstanding requests will be retired and objects moved to inactive queue, causing their last_read_seqno to be zero. Use this to update the sync_seqno correctly. RING_SYNC registers after wrap will contain pre wrap values which are >= seqno. So injecting the semaphore wait into ring completes immediately. Original idea for using last_read_seqno from Chris Wilson. Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-29 11:43:54 +01:00
Chris Wilson	3e9605018a	drm/i915: Rearrange code to only have a single method for waiting upon the ring Replace the wait for the ring to be clear with the more common wait for the ring to be idle. The principle advantage is one less exported intel_ring_wait function, and the removal of a hardcoded value. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-29 11:43:53 +01:00
Chris Wilson	b662a06632	drm/i915: Simplify flushing activity on the ring As we now always preallocate the seqno before writing to the ring, we can trivially test if we have any pending activity on the ring by inspecting the olr. This makes it then possible to flush operations that are not normally associated with a request, like power-management. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-29 11:43:53 +01:00
Chris Wilson	9d7730914f	drm/i915: Preallocate next seqno before touching the ring Based on the work by Mika Kuoppala, we realised that we need to handle seqno wraparound prior to committing our changes to the ring. The most obvious point then is to grab the seqno inside intel_ring_begin(), and then to reuse that seqno for all ring operations until the next request. As intel_ring_begin() can fail, the callers must already be prepared to handle such failure and so we can safely add further checks. This patch looks like it should be split up into the interface changes and the tweaks to move seqno wrapping from the execbuffer into the core seqno increment. However, I found no easy way to break it into incremental steps without introducing further broken behaviour. v2: Mika found a silly mistake and a subtle error in the existing code; inside i915_gem_retire_requests() we were resetting the sync_seqno of the target ring based on the seqno from this ring - which are only related by the order of their allocation, not retirement. Hence we were applying the optimisation that the rings were synchronised too early, fortunately the only real casualty there is the handling of seqno wrapping. v3: Do not forget to reset the sync_seqno upon module reinitialisation, ala resume. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=863861 Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> [v2] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-29 11:43:52 +01:00
Chris Wilson	b5d177946a	drm/i915: Wait upon the last request seqno, rather than a future seqno In commit `69c2fc8913` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Fri Jul 20 12:41:03 2012 +0100 drm/i915: Remove the per-ring write list the explicit flush was removed from i915_ring_idle(). However, we continued to wait upon the next seqno which now did not correspond to any request (except for the unusual condition of a failure to queue a request after execbuffer) and so would wait indefinitely. This has an important side-effect that i915_gpu_idle() does not cause the seqno to be incremented. This is vital if we are to be able to idle the GPU to handle seqno wraparound, as in subsequent patches. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-29 11:43:51 +01:00
Chris Wilson	5774506f15	drm/i915: Borrow our struct_mutex for the direct reclaim If we have hit oom whilst holding our struct_mutex, then currently we cannot reap our own GPU buffers which likely pin most of memory, making an outright OOM more likely. So if we are running in direct reclaim and already hold the mutex, attempt to free buffers knowing that the original function can not continue until we return. v2: Add a note explaining that the mutex may be stolen due to pre-emption, and that is bad. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-21 17:47:14 +01:00
Chris Wilson	8742267af4	drm/i915: Defer assignment of obj->gtt_space until after all possible mallocs As we may invoke the shrinker whilst trying to allocate memory to hold the gtt_space for this object, we need to be careful not to mark the drm_mm_node as activated (by assigning it to this object) before we have finished our sequence of allocations. Note: We also need to move the binding of the object into the actual pagetables down a bit. The best way seems to be to move it out into the callsites. Reported-by: Imre Deak <imre.deak@gmail.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Added small note to commit message to summarize review discussion.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-21 17:47:13 +01:00
Chris Wilson	c9839303d1	drm/i915: Pin the object whilst faulting it in In order to prevent reaping of the object whilst setting it up to handle the pagefault, we need to mark it as pinned. This has the nice side-effect of eliminating some special cases from the pagefault handler as well! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-21 17:45:04 +01:00
Chris Wilson	fbdda6fb5e	drm/i915: Guard pages being reaped by OOM whilst binding-to-GTT In the circumstances that the shrinker is allowed to steal the mutex in order to reap pages, we need to be careful to prevent it operating on the current object and shooting ourselves in the foot. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-21 17:45:04 +01:00
Dave Airlie	9fabd4eede	Merge branch 'for-airlied' of git://people.freedesktop.org/~danvet/drm-intel into drm-next Daniel writes: Highlights of this -next round: - ivb fdi B/C fixes - hsw sprite/plane offset fixes from Damien - unified dp/hdmi encoder for hsw, finally external dp support on hsw (Paulo) - kill-agp and some other prep work in the gtt code from Ben - some fb handling fixes from Ville - massive pile of patches to align hsw VGA with the spec and make it actually work (Paulo) - pile of workarounds from Jesse, mostly for vlv, but also some other related platforms - start of a dev_priv reorg, that thing grew out of bounds and chaotic - small bits&pieces all over the place, down to better error handling for load-detect on gen2 (Chris, Jani, Mika, Zhenyu, ...) On top of the previous pile (just copypasta): - tons of hsw dp prep patches form Paulo - round scheduled work items and timers to nearest second (Chris) - some hw workarounds (Jesse&Damien) - vlv dp support and related fixups (Vijay et al.) - basic haswell dp support, not yet wired up for external ports (Paulo) - edp support (Paulo) - tons of refactorings to prepare for the above (Paulo) - panel rework, unifiying code between lvds and edp panels (Jani) - panel fitter scaling modes (Jani + Yuly Novikov) - panel power improvements, should now work without the BIOS setting it up - extracting some dp helpers from radeon/i915 and move them to drm_dp_helper.c - randome pile of workarounds (Damien, Ben, ...) - some cleanups for the register restore code for suspend/resume - secure batchbuffer support, should enable tear-free blits on gen6+ Chris) - random smaller fixlets and cleanups. * 'for-airlied' of git://people.freedesktop.org/~danvet/drm-intel: (231 commits) drm/i915: Restore physical HWS_PGA after resume drm/i915: Report amount of usable graphics memory in MiB drm/i915/i2c: Track users of GMBUS force-bit drm/i915: Allocate the proper size for contexts. drm/i915: Update load-detect failure paths for modeset-rework drm/i915: Clear unused fields of mode for framebuffer creation drm/i915: Always calculate 8xx WM values based on a 32-bpp framebuffer drm/i915: Fix sparse warnings in from AGP kill code drm/i915: Missed lock change with rps lock drm/i915: Move the remaining gtt code drm/i915: flush system agent TLBs on SNB drm/i915: Kill off now unused gen6+ AGP code drm/i915: Calculate correct stolen size for GEN7+ drm/i915: Stop using AGP layer for GEN6+ drm/i915: drop the double-OP_STOREDW usage in blt_ring_flush drm/i915: don't rewrite the GTT on resume v4 drm/i915: protect RPS/RC6 related accesses (including PCU) with a new mutex drm/i915: put ring frequency and turbo setup into a work queue v5 drm/i915: don't block resume on fb console resume v2 drm/i915: extract l3_parity substruct from dev_priv ...	2012-11-20 09:22:35 +10:00
Ben Widawsky	26b1ff35c8	drm/i915: Move the remaining gtt code It's pretty much all consolidated now that we've killed AGP. We can move the one outlier, and defines too. (Kill some unused defines in the process) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-11 23:51:44 +01:00
Ben Widawsky	e76e9aebcd	drm/i915: Stop using AGP layer for GEN6+ As a quick hack we make the old intel_gtt structure mutable so we can fool a bunch of the existing code which depends on elements in that data structure. We can/should try to remove this in a subsequent patch. This should preserve the old gtt init behavior which upon writing these patches seems incorrect. The next patch will fix these things. The one exception is VLV which doesn't have the preserved flush control write behavior. Since we want to do that for all GEN6+ stuff, we'll handle that in a later patch. Mainstream VLV support doesn't actually exist yet anyway. v2: Update the comment to remove the "voodoo" Check that the last pte written matches what we readback v3: actually kill cache_level_to_agp_type since most of the flags will disappear in an upcoming patch v4: v3 was actually not what we wanted (Daniel) Make the ggtt bind assertions better and stricter (Chris) Fix some uncaught errors at gtt init (Chris) Some other random stuff that Chris wanted v5: check for i==0 in gen6_ggtt_bind_object to shut up gcc (Ben) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by [v4]: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Make the cache_level -> agp_flags conversion for pre-gen6 a tad more robust by mapping everything != CACHE_NONE to the cached agp flag - we have a 1:1 uncached mapping, but different modes of cacheable (at least on later generations). Suggested by Chris Wilson.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-11 23:51:42 +01:00
Daniel Vetter	a4da4fa4e5	drm/i915: extract l3_parity substruct from dev_priv Pretty astonishing how far apart these two members landed ... Especially since I've already removed almost 200 lines in between. Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-11-11 23:51:40 +01:00
Daniel Vetter	c2fb791692	Linux 3.7-rc2 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQEcBAABAgAGBQJQgvdwAAoJEHm+PkMAQRiG+3AH/i2XsqqN3VctL0nnbWfvds+Q aKulfIdJTjKiVAsawPUtRqReZ8ijiebrgA/53lZLlrFOoPPQ5+LHmnSyQF6gErOY NuAE1lijXDRM1pwBlhvOBbAj26wUobGjqONFJ9OkKr758Ue8ds/Q7UdxyEgmYgmg tvVMzfRcICzryUV3PcqL+3cNPpCUdT6wGGRJ9DCv/jvGiWKExWhOle5oltrmxk+D NsqRcws5pEubfHE4J8BvNWr8lE1kHfYVhrJETiLJUiN2XAJcbI4Jy7rU/3EGteNS 0HMZdaPPjV874lohdM70X2225SbYrCVkAYB5hnZCTeC3tYyCawBBPMQoyAiOcmU= =+861 -----END PGP SIGNATURE----- Merge tag 'v3.7-rc2' into drm-intel-next-queued Linux 3.7-rc2 Backmerge to solve two ugly conflicts: - uapi. We've already added new ioctl definitions for -next. Do I need to say more? - wc support gtt ptes. We've had to revert this for snb+ for 3.7 and also fix a few other things in the code. Now we know how to make it work on snb+, but to avoid losing the other fixes do the backmerge first before re-enabling wc gtt ptes on snb+. And a few other minor things, among them git getting confused in intel_dp.c and seemingly causing a conflict out of nothing ... Conflicts: drivers/gpu/drm/i915/i915_reg.h drivers/gpu/drm/i915/intel_display.c drivers/gpu/drm/i915/intel_dp.c drivers/gpu/drm/i915/intel_modes.c include/drm/i915_drm.h Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-10-22 14:34:51 +02:00
Dave Airlie	64acba6a7a	Merge branch 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel into drm-fixes Daniel writes: The big thing is the disabling of the hsw support by default, cc: stable. We've aimed for basic hsw support in 3.6, but due to a few bad happenstances we've screwed up and only 3.8 will have better modeset support than vesa. To avoid yet another round of fallout from such a gaffle on for the next platform we've added a module option to disable early hw support by default. That should also give us more flexibility in bring-up. Otherwise just small fixes: - 3 fixes from Egbert for sdvo corner cases - invert-brightness quirk entry from Egbert - revert a dp link training change, it regresses some setups - and shut up a spurious WARN in our gem fault handler. - regression fix for an oops on bit17 swizzling machines, introduce in 3.7 - another no-lvds quirk * 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel: drm/i915: Initialize obj->pages before use by i915_gem_object_do_bit17_swizzle() drm/i915: Add no-lvds quirk for Supermicro X7SPA-H drm/i915: Insert i915_preliminary_hw_support variable. drm/i915: shut up spurious WARN in the gtt fault handler Revert "drm/i915: Try harder to complete DP training pattern 1" DRM/i915: Restore sdvo_flags after dtd->mode->dtd Roundrtrip. DRM/i915: Don't clone SDVO LVDS with analog. DRM/i915: Add QUIRK_INVERT_BRIGHTNESS for NCR machines. DRM/i915: Don't delete DPLL Multiplier during DAC init.	2012-10-22 09:55:48 +10:00
Chris Wilson	74ce6b6c63	drm/i915: Initialize obj->pages before use by i915_gem_object_do_bit17_swizzle() If we leave obj->pages set to NULL before attempting to deswizzle them, then an OOPS is well deserved. Fixes regression introduced in commit `9da3da660d` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Fri Jun 1 15:20:22 2012 +0100 drm/i915: Replace the array of pages with a scatterlist Reported-and-tested-by: Krzysztof Kolasa Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-10-19 21:52:52 +02:00
Daniel Vetter	a7c2e1aad6	drm/i915: shut up spurious WARN in the gtt fault handler -ENOSPC can happen if userspace is being simplistic and tries to map a too big object. To aid further spurious WARN debugging, also print out the error code. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56017 Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-10-17 11:56:40 +02:00
Dave Airlie	3459f62047	Merge branch 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel into drm-fixes Daniel writes: "- some register magic to fix hsw crw (Paulo&Ben) - fix backlight destruction for cpu edp (Jani) - fix gen ch7xxx dvo ->get_hw_state - fixup the plane->pipe fixup code, the broken version massively angers the modeset sanity checks - kill pipe A quirk for i855gm, otherwise I get a black screen with the above patch - fixup for gem_get_page helper (Chris) - fixup guardband clipping w/a (Ken), without this mesa master can erronously drop vertices on snb, mesa 9.0 has the optimization reverted - another pageflip vs. modeset fix - kill bogus BUG_ON which broke ums+gem from Willy Tarreau (gasp, people are still using this!)" * 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel: drm/i915: fix non-DP-D eDP backlight cleanup and module reload drm/i915: HSW CRW stability magic drm/i915/dvo-ch7xxx: fix get_hw_state drm/i915: fixup the plane->pipe fixup code drm/i915: rip out the pipe A quirk for i855gm drm/i915: disable wc gtt pte mappings on gen2 drm/i915: fixup i915_gem_object_get_page inline helper drm/i915: Disallow preallocation of requests drm/i915: Set guardband clipping workaround bit in the right register. drm/i915: paper over a pipe-enable vs pageflip race drm/i915: remove useless BUG_ON which caused a regression in 3.5.	2012-10-16 10:11:59 +10:00
Rodrigo Vivi	eda2d7f582	drm/i915: HSW CRW stability magic This magic brings stability to HSW CRW machines. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-10-12 10:59:11 +02:00
Chris Wilson	acb868d3d7	drm/i915: Disallow preallocation of requests The intention was to allow the caller to avoid a failure to queue a request having already written commands to the ring. However, this is a moot point as the i915_add_request() can fail for other reasons than a mere allocation failure and those failure cases are more likely than ENOMEM. So the overlay code already had to handle i915_add_request() failures, and due to commit `3bb73aba1e` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Fri Jul 20 12:40:59 2012 +0100 drm/i915: Allow late allocation of request for i915_add_request() the error handling code in intel_overlay.c was subject to causing double-frees, as found by coverity. Rather than further complicate i915_add_request() and callers, realise the battle is lost and adapt intel_overlay.c to take advantage of the late allocation of requests. v2: Handle callers passing in a NULL seqno. v3: Ditto. This time for sure. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-10-12 10:59:09 +02:00
Chris Wilson	bcb4508616	drm/i915: Align the retire_requests worker to the nearest second By using round_jiffies() we can align the wakeup of our worker to the nearest second in order to batch wakeups and reduce system load, which is useful for unimportant coarse tasks like our retire_requests. v2: round_jiffies_relative() already returns the relative timeout value, so no need to incorrectly perform the subtraction twice. The timer interface still leaves the possibility for the value of jiffies to change be we program the timer. Suggested-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Arjan van de Ven <arjan@linux.intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-10-08 18:45:21 +02:00
Chris Wilson	cecc21fea9	drm/i915: Align the hangcheck wakeup to the nearest second round_jiffies() aligns the wakeup time to the nearest second in order to batch wakeups and reduce system load, which is useful for unimportant coarse timers like our hangcheck. v2: round_jiffies_relative() returns the relative jiffie value, whereas we need the absolute value for the timer. Suggested-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Arjan van de Ven <arjan@linux.intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-10-08 18:44:36 +02:00
Willy Tarreau	c77d7162a7	drm/i915: remove useless BUG_ON which caused a regression in 3.5. starting an old X server causes a kernel BUG since commit `1b50247a8d`: ------------[ cut here ]------------ kernel BUG at drivers/gpu/drm/i915/i915_gem.c:3661! invalid opcode: 0000 [#1] SMP Modules linked in: snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss uvcvideo +videobuf2_core videodev videobuf2_vmalloc videobuf2_memops uhci_hcd ath9k mac80211 snd_hda_codec_realtek ath9k_common microcode +ath9k_hw psmouse serio_raw sg ath cfg80211 atl1c lpc_ich mfd_core ehci_hcd snd_hda_intel snd_hda_codec snd_hwdep snd_pcm rtc_cmos +snd_timer snd evdev eeepc_laptop snd_page_alloc sparse_keymap Pid: 2866, comm: X Not tainted 3.5.6-rc1-eeepc #1 ASUSTeK Computer INC. 1005HA/1005HA EIP: 0060:[<c12dc291>] EFLAGS: 00013297 CPU: 0 EIP is at i915_gem_entervt_ioctl+0xf1/0x110 EAX: f5941df4 EBX: f5940000 ECX: 00000000 EDX: 00020000 ESI: f5835400 EDI: 00000000 EBP: f51d7e38 ESP: f51d7e20 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 CR0: 8005003b CR2: b760e0a0 CR3: 351b6000 CR4: 000007d0 DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 DR6: ffff0ff0 DR7: 00000400 Process X (pid: 2866, ti=f51d6000 task=f61af8d0 task.ti=f51d6000) Stack: 00000001 00000000 f5835414 f51d7e84 f5835400 f54f85c0 f51d7f10 c12b530b 00000001 c151b139 c14751b6 c152e030 00000b32 00006459 00000059 0000e200 00000001 00000000 00006459 c159ddd0 c12dc1a0 ffffffea 00000000 00000000 Call Trace: [<c12b530b>] drm_ioctl+0x2eb/0x440 [<c12dc1a0>] ? i915_gem_init+0xe0/0xe0 [<c1052b2b>] ? enqueue_hrtimer+0x1b/0x50 [<c1053321>] ? __hrtimer_start_range_ns+0x161/0x330 [<c10530b3>] ? lock_hrtimer_base+0x23/0x50 [<c1053163>] ? hrtimer_try_to_cancel+0x33/0x70 [<c12b5020>] ? drm_version+0x90/0x90 [<c10ca171>] vfs_ioctl+0x31/0x50 [<c10ca2e4>] do_vfs_ioctl+0x64/0x510 [<c10535de>] ? hrtimer_nanosleep+0x8e/0x100 [<c1052c20>] ? update_rmtp+0x80/0x80 [<c10ca7c9>] sys_ioctl+0x39/0x60 [<c1433949>] syscall_call+0x7/0xb Code: 83 c4 0c 5b 5e 5f 5d c3 c7 44 24 04 2c 05 53 c1 c7 04 24 6f ef 47 c1 e8 6e e0 fd ff c7 83 38 1e 00 00 00 00 00 00 e9 3f ff ff +ff <0f> 0b eb fe 0f 0b eb fe 8d b4 26 00 00 00 00 0f 0b eb fe 8d b6 EIP: [<c12dc291>] i915_gem_entervt_ioctl+0xf1/0x110 SS:ESP 0068:f51d7e20 ---[ end trace dd332ec083cbd513 ]--- The crash happens here in i915_gem_entervt_ioctl() : 3659 BUG_ON(!list_empty(&dev_priv->mm.active_list)); 3660 BUG_ON(!list_empty(&dev_priv->mm.flushing_list)); -> 3661 BUG_ON(!list_empty(&dev_priv->mm.inactive_list)); 3662 mutex_unlock(&dev->struct_mutex); Quoting Chris : "That BUG_ON there is silly and can simply be removed. The check is to verify that no batches were submitted to the kernel whilst the UMS/GEM client was suspended - to which the BUG_ONs are a crude approximation. Furthermore, the checks are too late, since it means we attempted to program the hardware whilst it was in an invalid state, the BUG_ONs are the least of your concerns at that point." Note that this regression has been introduced in commit `1b50247a8d` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Tue Apr 24 15:47:30 2012 +0100 drm/i915: Remove the list of pinned inactive objects Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Willy Tarreau <w@1wt.eu> [danvet: Added note about the regressing commit and cc: stable.] Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-10-07 22:57:25 +02:00
Dave Airlie	1f31c69dac	Merge branch 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel into drm-next Daniel writes: Bigger -fixes pile, mostly because I've included Ajax' DP dongle stuff, as discussed on irc. Otherwise just small things: - regression fix to finally make 6bpc auto-dither on dp work (Jani) - reinstate an snb ctx w/a that accidentally got lost in a rework (Chris) - fixup the DP train sequence, logic-goof-up uncovered by Coverty (Chris) - fix set_caching locking (Ben) - fix spurious segfault on con-current gtt mmap faulting (Dimitry and Mika) - some pageflip correctness fixes (still hunting down some issues, but these are the worst offenders of confused code that we've tracked down thus far) from Chris and me - fixup swizzling settings on vlv (Jesse) - gt_mode w/a from Ben added, fixes snb gt1 rc6+hw ctx hangs. * 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel: drm/i915: Fix GT_MODE default value drm/i915: don't frob the vblank ts in finish_page_flip drm/i915: call drm_handle_vblank before finish_page_flip drm/i915: print warning if vmi915_gem_fault error is not handled drm/i915: EBUSY status handling added to i915_gem_fault(). drm/i915: Try harder to complete DP training pattern 1 drm/i915: set swizzling to none on VLV drm/dp: Make sink count DP 1.2 aware drm/dp: Document DP spec versions for various DPCD registers drm/i915/dp: Be smarter about connection sense for branch devices drm/i915/dp: Fetch downstream port info if needed during DPCD fetch drm/dp: Update DPCD defines drm: Export drm_probe_ddc() drm/i915: Flush the pending flips on the CRTC before modification drm/i915: Actually invalidate the TLB for the SandyBridge HW contexts w/a drm/i915: Fix set_caching locking drm/i915: use adjusted_mode instead of mode for checking the 6bpc force flag	2012-10-07 21:13:54 +10:00
Mika Kuoppala	4d0f817e74	drm/i915: print warning if vmi915_gem_fault error is not handled Falling into default case in vmi915_gem_fault is a bug. Be more verbose about it. Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-10-04 10:33:42 +02:00
Dmitry Rogozhkin	e79e0fe380	drm/i915: EBUSY status handling added to i915_gem_fault(). Subsequent threads returning EBUSY from vm_insert_pfn() was not handled correctly. As a result concurrent access from new threads to mmapped data caused SIGBUS. Note that this fixes i-g-t/tests/gem_threaded_tiled_access. Tested-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-10-04 10:33:42 +02:00
Linus Torvalds	612a9aab56	Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux Pull drm merge (part 1) from Dave Airlie: "So first of all my tree and uapi stuff has a conflict mess, its my fault as the nouveau stuff didn't hit -next as were trying to rebase regressions out of it before we merged. Highlights: - SH mobile modesetting driver and associated helpers - some DRM core documentation - i915 modesetting rework, haswell hdmi, haswell and vlv fixes, write combined pte writing, ilk rc6 support, - nouveau: major driver rework into a hw core driver, makes features like SLI a lot saner to implement, - psb: add eDP/DP support for Cedarview - radeon: 2 layer page tables, async VM pte updates, better PLL selection for > 2 screens, better ACPI interactions The rest is general grab bag of fixes. So why part 1? well I have the exynos pull req which came in a bit late but was waiting for me to do something they shouldn't have and it looks fairly safe, and David Howells has some more header cleanups he'd like me to pull, that seem like a good idea, but I'd like to get this merge out of the way so -next dosen't get blocked." Tons of conflicts mostly due to silly include line changes, but mostly mindless. A few other small semantic conflicts too, noted from Dave's pre-merged branch. * 'drm-next' of git://people.freedesktop.org/~airlied/linux: (447 commits) drm/nv98/crypt: fix fuc build with latest envyas drm/nouveau/devinit: fixup various issues with subdev ctor/init ordering drm/nv41/vm: fix and enable use of "real" pciegart drm/nv44/vm: fix and enable use of "real" pciegart drm/nv04/dmaobj: fixup vm target handling in preparation for nv4x pcie drm/nouveau: store supported dma mask in vmmgr drm/nvc0/ibus: initial implementation of subdev drm/nouveau/therm: add support for fan-control modes drm/nouveau/hwmon: rename pwm0* to pmw1* to follow hwmon's rules drm/nouveau/therm: calculate the pwm divisor on nv50+ drm/nouveau/fan: rewrite the fan tachometer driver to get more precision, faster drm/nouveau/therm: move thermal-related functions to the therm subdev drm/nouveau/bios: parse the pwm divisor from the perf table drm/nouveau/therm: use the EXTDEV table to detect i2c monitoring devices drm/nouveau/therm: rework thermal table parsing drm/nouveau/gpio: expose the PWM/TOGGLE parameter found in the gpio vbios table drm/nouveau: fix pm initialization order drm/nouveau/bios: check that fixed tvdac gpio data is valid before using it drm/nouveau: log channel debug/error messages from client object rather than drm client drm/nouveau: have drm debugging macros build on top of core macros ...	2012-10-03 23:29:23 -07:00
David Howells	760285e7e7	UAPI: (Scripted) Convert #include "..." to #include <path/...> in drivers/gpu/ Convert #include "..." to #include <path/...> in drivers/gpu/. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Dave Airlie <airlied@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Dave Jones <davej@redhat.com>	2012-10-02 18:01:07 +01:00
David Howells	4126d5d61f	UAPI: (Scripted) Remove redundant DRM UAPI header #inclusions from drivers/gpu/. Remove redundant DRM UAPI header #inclusions from drivers/gpu/. Remove redundant #inclusions of core DRM UAPI headers (drm.h, drm_mode.h and drm_sarea.h). They are now #included via drmP.h and drm_crtc.h via a preceding patch. Without this patch and the patch to make include the UAPI headers from the core headers, after the UAPI split, the DRM C sources cannot find these UAPI headers because the DRM code relies on specific -I flags to make #include "..." work on headers in include/drm/ - but that does not work after the UAPI split without adding more -I flags. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Dave Airlie <airlied@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Dave Jones <davej@redhat.com>	2012-10-02 18:01:05 +01:00
Ben Widawsky	3bc2913e2c	drm/i915: Fix set_caching locking On the EINVAL case we don't release struct_mutex. It should be safe to grab the lock after checking the parameters, which also resolves the issues. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-27 08:45:11 +02:00
Ben Widawsky	199adf40ae	drm/i915: s/cacheing/caching/ Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-26 09:24:36 +02:00
Daniel Vetter	398b7a1b88	Linux 3.6-rc7 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (GNU/Linux) iQEcBAABAgAGBQJQX7MuAAoJEHm+PkMAQRiG0h0IAJURkrMCAQUxA+Ik66ReH89s LQcVd0U9uL4UUOi7f5WR64Vf9Cfu6VVGX9ZKSvjpNskvlQaUQPMIt4pMe6g4X4dI u0bApEy4XZz3nGabUAghIU8jJ8cDmhCG6kPpSiS7pi7KHc0yIa4WFtJRrIpGaIWT xuK38YOiOHcSDRlLyWZzainMncQp/ixJdxnqVMTonkVLk0q0b84XzOr4/qlLE5lU i+TsK3PRKdQXgvZ4CebL+srPBwWX1dmgP3VkeBloQbSSenSeELICbFWavn2ml+sF GXi4dO93oNquL/Oy5SwI666T4uNcrRPaS+5X+xSZgBW/y2aQVJVJuNZg6ZP/uWk= =0v2l -----END PGP SIGNATURE----- Merge tag 'v3.6-rc7' into drm-intel-next-queued Manual backmerge of -rc7 to resolve a silent conflict leading to compile failure in drivers/gpu/drm/i915/intel_hdmi.c. This is due to the bugfix in -rc7: commit `b98b601672` Author: Wang Xingchao <xingchao.wang@intel.com> Date: Thu Sep 13 07:43:22 2012 +0800 drm/i915: HDMI - Clear Audio Enable bit for Hot Plug Since this code moved around a lot in -next git put that snippet at the wrong spot. I've tried to fix this by making the conflict explicit by merging a version for next with: commit `3cce574f01` Author: Wang Xingchao <xingchao.wang@intel.com> Date: Thu Sep 13 11:19:00 2012 +0800 drm/i915: HDMI - Clear Audio Enable bit for Hot Plug unconditionally But that failed to solve the entire problem. To avoid pushing out further -nightly branch to our QA where this is broken, do the backmerge and manually add the stuff git adds to -next from the patch in -fixes. Note that this doesn't show up in git's merge diff (and hence is also not handled by git rerere), which adds to the reasons why I'd like to fix this with a verbose backmerge. The git merge diff only shows a bunch of trivial conflicts of the "code changed in lines next to each another" kind. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-24 18:17:12 +02:00
Chris Wilson	2f745ad3d3	drm/i915: Convert the dmabuf object to use the new i915_gem_object_ops By providing a callback for when we need to bind the pages, and then release them again later, we can shorten the amount of time we hold the foreign pages mapped and pinned, and importantly the dmabuf objects then behave as any other normal object with respect to the shrinker and memory management. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-20 14:23:10 +02:00
Chris Wilson	9da3da660d	drm/i915: Replace the array of pages with a scatterlist Rather than have multiple data structures for describing our page layout in conjunction with the array of pages, we can migrate all users over to a scatterlist. One major advantage, other than unifying the page tracking structures, this offers is that we replace the vmalloc'ed array (which can be up to a megabyte in size) with a chain of individual pages which helps reduce memory pressure. The disadvantage is that we then do not have a simple array to iterate, or to access randomly. The common case for this is in the relocation processing, which will typically fit within a single scatterlist page and so be almost the same cost as the simple array. For iterating over the array, the extra function call could be optimised away, but in reality is an insignificant cost of either binding the pages, or performing the pwrite/pread. v2: Fix drm_clflush_sg() to not invoke wbinvd as well! And fix the trivial compile error from rebasing. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-20 14:22:57 +02:00
Chris Wilson	f60d7f0c1d	drm/i915: Pin backing pages for pread By using the recently introduced pinning of pages, we can safely drop the mutex in the knowledge that the pages are not going to disappear beneath us, and so we can simplify the code for iterating over the pages. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-20 14:22:57 +02:00
Chris Wilson	755d22184f	drm/i915: Pin backing pages for pwrite By using the recently introduced pinning of pages, we can safely drop the mutex in the knowledge that the pages are not going to disappear jeneath us, and so we can simplify the code for iterating over the pages. Note: The old code had such complicated page refcounting since it used obj->pages as a micro-optimization if it's there, but that could (before this patch) disappear when we drop the dev->struct_mutex. Hence some manual page refcounting was required for the slow path, complicated by the fact that pages returned by shmem_read_mapping_page already have a pageref, which needs to be dropped again. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> [danvet: Added note to explain the question Ben raised in review.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-20 14:22:56 +02:00
Chris Wilson	a5570178c0	drm/i915: Pin backing pages whilst exporting through a dmabuf vmap We need to refcount our pages in order to prevent reaping them at inopportune times, such as when they currently vmapped or exported to another driver. However, we also wish to keep the lazy deallocation of our pages so we need to take a pin/unpinned approach rather than a simple refcount. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-20 14:22:56 +02:00
Chris Wilson	37e680a15f	drm/i915: Introduce drm_i915_gem_object_ops In order to specialise functions depending upon the type of object, we can attach vfuncs to each object via a new ->ops pointer. For instance, this will be used in future patches to only bind pages from a dma-buf for the duration that the object is used by the GPU - and so prevent them from pinning those pages for the entire of the object. v2: Bonus comments. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-20 14:22:55 +02:00
Chris Wilson	7e81a42e34	drm/i915: Reduce a pin-leak BUG into a WARN Pin-leaks persist and we get the perennial bug reports of machine lockups to the BUG_ON(pin_count==MAX). If we instead loudly report that the object cannot be pinned at that time it should prevent the driver from locking up, and hopefully restore a semblance of working whilst still leaving us a OOPS to debug. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-09-17 10:12:57 +02:00
Dave Airlie	65983bd605	Merge branch 'for-airlied' of git://people.freedesktop.org/~danvet/drm-intel into drm-next Daniel writes: "New stuff for -next. Highlights: - prep patches for the modeset rework. Note that one of those patches touches the fb helper in the common drm code. - hasw hdmi audio support (Wang Xingchao) - improved instdone dumping for gen7 (Ben) - unbound tracking and a few follow-up patches from Chris - dma_buf->begin/end_cpu_access plus fix for drm/udl (Dave) - improve mmio error reporting for hsw - prep patch for WQ_NON_REENTRANT removal (Tejun Heo) " * 'for-airlied' of git://people.freedesktop.org/~danvet/drm-intel: (41 commits) drm/i915: Remove __GFP_NO_KSWAPD drm/i915: disable rc6 on ilk when vt-d is enabled drm/i915: Avoid unbinding due to an interrupted pin_and_fence during execbuffer drm/i915: Use new INSTDONE registers (Gen7+) drm/i915: Add new INSTDONE registers drm/i915: Extract reading INSTDONE drm/i915: Use a non-blocking wait for set-to-domain ioctl drm/i915: Juggle code order to ease flow of the next patch drm/i915: Use cpu relocations if the object is in the GTT but not mappable drm/i915: Extract general object init routine drm/i915: Protect private gem objects from truncate (such as imported dmabuf) drm/i915: Only pwrite through the GTT if there is space in the aperture i915: use alloc_ordered_workqueue() instead of explicit UNBOUND w/ max_active = 1 drm/i915: Find unclaimed MMIO writes. drm/i915: Add ERR_INT to gen7 error state drm/i915: Cantiga+ cannot handle a hsync front porch of 0 drm/i915: fix reassignment of variable "intel_dp->DP" drm/i915: Try harder to allocate an mmap_offset drm/i915: Show pin count in debugfs drm/i915: Show (count, size) of purgeable objects in i915_gem_objects ...	2012-09-03 12:05:01 +10:00
Sedat Dilek	d7c3b937bd	drm/i915: Remove __GFP_NO_KSWAPD When I pulled-in today's drm-intel-next into linux-next (next-20120824) I saw this build-breakage: drivers/gpu/drm/i915/i915_gem.c: In function 'i915_gem_object_get_pages_gtt': drivers/gpu/drm/i915/i915_gem.c:1778:40: error: '__GFP_NO_KSWAPD' undeclared (first use in this function) drivers/gpu/drm/i915/i915_gem.c:1778:40: note: each undeclared identifier is reported only once for each function it appears in This is caused by commit ba099ef165f8 ("mm: remove __GFP_NO_KSWAPD") and commit b6beae2c2014 ("mm: remove __GFP_NO_KSWAPD fixes") in linux-next (next-20120824). Fix this by removing __GFP_NO_KSWAPD from drm/i915 driver. Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-27 17:11:38 +02:00
Dave Airlie	93bb70e0c0	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into drm-next There was some merge conflicts in -next and they weren't so pretty, so backmerge now to avoid them. Conflicts: drivers/gpu/drm/i915/i915_gem.c drivers/gpu/drm/i915/intel_modes.c	2012-08-27 16:22:20 +10:00
Chris Wilson	3236f57a01	drm/i915: Use a non-blocking wait for set-to-domain ioctl The principal use for set-to-domain is for userspace to serialise operations with a particular buffer, for example to maintain coherency with a CPU map or to ratelimit its rendering by waiting on all previous operations before continuing. As such we tend to hold the struct_mutex for long periods during the synchronisation and so cause contention issues with other users of the graphics device, even for independent operations as memory management. An example is the contention between compiz and X which causes jitter in the display and a drop in peak throughput. The ultimate solution would be a set of fine grained locks and lockless operations, but an intermediate step is to first attempt the synchronisation for set-to-domain without holding the mutex. This introduces a number of race conditions, so we limit it use to the ioctl periphery where we have no dependent state and can safely complete with a locked synchronisation afterwards. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-24 11:12:56 +02:00
Chris Wilson	b361237bcc	drm/i915: Juggle code order to ease flow of the next patch Move the wait-for-rendering logic around in the file so that we can group it together with the subsequent variations. The general goal is to have the lower level routines clustered together and then the higher level logic building upon those low level routines that came before. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-24 11:12:53 +02:00
Chris Wilson	0327d6ba99	drm/i915: Extract general object init routine As we wish to create specialised object constructions in the near future that share the same basic GEM object struct, export the default initializer. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-24 02:04:38 +02:00
Chris Wilson	4d6294bf77	drm/i915: Protect private gem objects from truncate (such as imported dmabuf) If the object has no backing shmemfs filp, then we obviously cannot perform a truncation operation upon it. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-24 02:04:31 +02:00
Chris Wilson	86a1ee26bb	drm/i915: Only pwrite through the GTT if there is space in the aperture Avoid stalling and waiting for the GPU by checking to see if there is sufficient inactive space in the aperture for us to bind the buffer prior to writing through the GTT. If there is inadequate space we will have to stall waiting for the GPU, and incur overheads moving objects about. Instead, only incur the clflush overhead on the target object by writing through shmem. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-24 02:03:33 +02:00
Chris Wilson	d8cb508669	drm/i915: Try harder to allocate an mmap_offset Given the persistence of an offset for the lifetime of an object, itis easy to contemplate how the mmap space becomes badly fragmented to the point that further allocations fail with ENOSPC. Our only recourse at this point is to try to purge the objects to release some space and reattempt the allocation. References: https://bugs.freedesktop.org/show_bug.cgi?id=39552 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-21 14:34:36 +02:00
Chris Wilson	c4670ad080	drm/i915: Add some sanity checks to unbound tracking A pair of universally true checks that just need to be put in the right place depending on where in the patch sequence you go. Note that i915_gem_object_put_pages_gtt() already gains the BUG_ON(obj->gtt_space), but on reflection that needed to migrate to put_pages(). Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-21 14:34:20 +02:00
Chris Wilson	6c085a728c	drm/i915: Track unbound pages When dealing with a working set larger than the GATT, or even the mappable aperture when touching through the GTT, we end up with evicting objects only to rebind them at a new offset again later. Moving an object into and out of the GTT requires clflushing the pages, thus causing a double-clflush penalty for rebinding. To avoid having to clflush on rebinding, we can track the pages as they are evicted from the GTT and only relinquish those pages on memory pressure. As usual, if it were not for the handling of out-of-memory condition and having to manually shrink our own bo caches, it would be a net reduction of code. Alas. Note: The patch also contains a few changes to the last-hope evict_everything logic in i916_gem_execbuffer.c - we no longer try to only evict the purgeable stuff in a first try (since that's superflous and only helps in OOM corner-cases, not fragmented-gtt trashing situations). Also, the extraction of the get_pages retry loop from bind_to_gtt (and other callsites) to get_pages should imo have been a separate patch. v2: Ditch the newly added put_pages (for unbound objects only) in i915_gem_reset. A quick irc discussion hasn't revealed any important reason for this, so if we need this, I'd like to have a git blame'able explanation for it. v3: Undo the s/drm_malloc_ab/kmalloc/ in get_pages that Chris noticed. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Split out code movements and rant a bit in the commit message with a few Notes. Done v2] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-21 14:34:11 +02:00
Daniel Vetter	225067eedf	drm/i915: move functions around Prep work to make Chris Wilson's unbound tracking patch a bit easier to read. Alas, I'd have preferred that moving the page allocation retry loop from bind to get_pages would have been a separate patch, too. But that looks like real work ;-) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-20 10:59:41 +02:00
Ben Widawsky	b6c7488df6	drm/i915/contexts: fix list corruption After reset we unconditionally reinitialize lists. If the context switch hasn't yet completed before the suspend, the default context object will end up on lists that are going to go away when we resume. The patch forces the context switch to be synchronous before suspend assuring that the active/inactive tracking is correct at the time of resume. References: https://bugs.freedesktop.org/show_bug.cgi?id=52429 Tested-by: Guang A Yang <guang.a.yang@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-17 09:21:34 +02:00
Chris Wilson	b2eadbc85b	drm/i915: Lazily apply the SNB+ seqno w/a Avoid the forcewake overhead when simply retiring requests, as often the last seen seqno is good enough to satisfy the retirment process and will be promptly re-run in any case. Only ensure that we force the coherent seqno read when we are explicitly waiting upon a completion event to be sure that none go missing, and also for when we are reporting seqno values in case of error or debugging. This greatly reduces the load for userspace using the busy-ioctl to track active buffers, for instance halving the CPU used by X in pushing the pixels from a software render (flash). The effect will be even more magnified with userptr and so providing a zero-copy upload path in that instance, or in similar instances where X is simply compositing DRI buffers. v2: Reverse the polarity of the tachyon stream. Daniel suggested that 'force' was too generic for the parameter name and that 'lazy_coherency' better encapsulated the semantics of it being an optimization and its purpose. Also notice that gen6_get_seqno() is only used by gen6/7 chipsets and so the test for IS_GEN6 \|\| IS_GEN7 is redundant in that function. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-08-10 11:11:32 +02:00
Chris Wilson	e6994aeedc	drm/i915: Export ability of changing cache levels to userspace By selecting the cache level (essentially whether or not the CPU snoops any updates to the bo, and on more recent machines whether it resides inside the CPU's last-level-cache) a userspace driver is able to then manage all of its memory within buffer objects, if it so desires. This enables the userspace driver to accelerate uploads and more importantly downloads from the GPU and to able to mix CPU and GPU rendering/activity efficiently. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Added code comment about where we plan to stuff platform specific cacheing control bits in the ioctl struct.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-26 12:56:25 +02:00
Chris Wilson	42d6ab4839	drm/i915: Segregate memory domains in the GTT using coloring Several functions of the GPU have the restriction that differing memory domains cannot be placed next to each other (as the GPU may prefetch beyond the end of one domain and hang as it crosses into the other domain). We use the facility of the drm_mm to mark ranges with a particular color that corresponds to the cache attributes of those pages in order to prevent allocating adjacent blocks of differing memory types. v2: Rebase ontop of drm_mm coloring v2. v3: Fix rebinding existing gtt_space and add a verification routine. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-26 12:56:25 +02:00
Chris Wilson	f047e395dd	drm/i915: Avoid concurrent access when marking the device as idle/busy As suggested by Daniel, rip out the independent timers for device and crtc busyness and integrate the manual powermanagement of the display engine into the GEM core and its request tracking. The benefits are that the code is a lot smaller, fewer moving parts and should fit more neatly into the overall activity tracking of the driver. v2: Complete overhaul and removal of the racy timers and workers. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:56 +02:00
Chris Wilson	a7b9761d0a	drm/i915: Split i915_gem_flush_ring() into seperate invalidate/flush funcs By moving the function to intel_ringbuffer and currying the appropriate parameter, hopefully we make the callsites easier to read and understand. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:55 +02:00
Chris Wilson	26b9c4a57f	drm/i915: Remove the explicit flush of the GPU write domain Rely instead on the insertion of the implicit flush before the seqno breadcrumb. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:54 +02:00
Chris Wilson	86d5bc3782	drm/i915: Remove explicit flush from i915_gem_object_flush_fence() As the flush is either performed explictly immediately after the execbuffer dispatch, or before the serialisation of last_fenced_seqno we can forgo the explict i915_gem_flush_ring(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:53 +02:00
Chris Wilson	69c2fc8913	drm/i915: Remove the per-ring write list This is now handled by a global flag to ensure we emit a flush before the next serialisation point (if we failed to queue one previously). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:53 +02:00
Chris Wilson	65ce302741	drm/i915: Remove the defunct flushing list As we guarantee to emit a flush before emitting the breadcrumb or the next batchbuffer, there is no further need for the flushing list. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:52 +02:00
Chris Wilson	0201f1ecf4	drm/i915: Replace the pending_gpu_write flag with an explicit seqno As we always flush the GPU cache prior to emitting the breadcrumb, we no longer have to worry about the deferred flush causing the pending_gpu_write to be delayed. So we can instead utilize the known last_write_seqno to hopefully minimise the wait times. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:52 +02:00
Chris Wilson	e5f1d962a8	drm/i915: Remove assertion over write domain after i915_gem_object_sync() As we move to lazily clearing the GPU write domain only when the buffer becomes inactive, this leaves a window of opportunity for i915_gem_object_pin_to_display_plane() to detect a seemingly inconsistent value. This function is special as it tries to pipeline the operation to avoid the stall and so may not retires the buffer and we may not get the opportunity to clear the write domain. However, we know all is good, so drop the assertion. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:51 +02:00
Chris Wilson	3bb73aba1e	drm/i915: Allow late allocation of request for i915_add_request() Request preallocation was added to i915_add_request() in order to support the overlay. However, not all users care and can quite happily ignore the failure to allocate the request as they will simply repeat the request in the future. By pushing the allocation down into i915_add_request(), we can then remove some rather ugly error handling in the callers. v2: Nullify request->file_priv otherwise we chase a garbage pointer when retiring requests. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:51 +02:00
Chris Wilson	e9808edd98	drm/i915: Return a mask of the active rings in the high word of busy_ioctl The intention is to help select which engine to use for copies with interoperating clients - such as a GL client making a request to the X server to perform a SwapBuffers, which may require copying from the active GL back buffer to the X front buffer. We choose to report a mask of the active rings to future proof the interface against any changes which may allow for the object to reside upon multiple rings. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: bikeshed away the write ring mask and add the explanation Chris sent in a follow-up mail why we decided to use masks.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 18:23:50 +02:00
Chris Wilson	eeef9b3874	drm/i915: Add -EIO to the list of known errors for __wait_seqno This prevents a WARN introduced with commit `de2b998552` Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Wed Jul 4 22:52:50 2012 +0200 drm/i915: don't return a spurious -EIO from intel_ring_begin Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-25 10:39:57 +02:00
Chris Wilson	67b1b57182	drm/i915: Disable the BLT on pre-production SNB hardware It never quite worked despite the numerous workarounds, yet I still see people trying to use this hardware and filing bug reports. As we no longer even try to implement the workarounds, since `6a233c7887` (drm/i915/ringbuffer: kill snb blt workaround), simply disable the ring. v2: Add a message to inform the user about the limited capabilities of their pre-production hardware. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-20 12:21:37 +02:00
Chris Wilson	6b9d89b436	drm: Add colouring to the range allocator In order to support snoopable memory on non-LLC architectures (so that we can bind vgem objects into the i915 GATT for example), we have to avoid the prefetcher on the GPU from crossing memory domains and so prevent allocation of a snoopable PTE immediately following an uncached PTE. To do that, we need to extend the range allocator with support for tracking and segregating different node colours. This will be used by i915 to segregate memory domains within the GTT. v2: Now with more drm_mm helpers and less driver interference. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Dave Airlie <airlied@redhat.com Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dave Airlie <airlied@gmail.com>	2012-07-16 05:59:37 +10:00
Daniel Vetter	a9340ccab5	drm/i915: properly SIGBUS on I/O errors ... instead of looping endless with no hope of ever serving that page-fault. We only need to break out of this loop when the gpu died, to run the reset work (and hopefully resurrect it). To clarify questions Chris raised on irc: This is about handling I/O errors not from our own code, but e.g. when the disk died when trying to swap in a gem bo. So this patch remidies the issue that the current handling only handles gpu-death-induced cases of -EIO. Admittedly, dying disks are much rarer than hanging gpus ...To clarify questions Chris raised on irc: This is about handling I/O errors not from our own code, but e.g. when the disk died when trying to swap in a gem bo. So this patch remidies the issue that the current handling only handles gpu-death-induced cases of -EIO. Admittedly, dying disks are much rarer than hanging gpus ... This seems to have been lost in: commit `d9bc7e9f32` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Mon Feb 7 13:09:31 2011 +0000 drm/i915: Fix infinite loop regression from `21dd3734` Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-05 10:03:01 +02:00
Daniel Vetter	0a6759c6ba	drm/i915: don't hang userspace when the gpu reset is stuck With the gpu reset no longer using a trylock we've increased the chances of userspace getting stuck quite a bit. To make that (hopefully) rare case more paletable time out when waiting for the gpu reset code to complete and signal this little issue to the caller by returning -EIO. This should help userspace to somewhat gracefully fall back and hopefully allow the user to grab some logs and reboot the machine (instead of staring at a frozen X screen in agony). Suggested by Chris Wilson because I've been stubborn about allowing the gpu reset code no to fail, ever (by removing the trylock). Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-05 10:02:24 +02:00
Daniel Vetter	d6b2c790a4	drm/i915: non-interruptible sleeps can't handle -EAGAIN So don't return -EAGAIN, even in the case of a gpu hang. Remap it to -EIO instead. Note that this isn't really an issue with interruptability, but more that we have quite a few codepaths (mostly around kms stuff) that simply can't handle any errors and hence not even -EAGAIN. Instead of adding proper failure paths so that we could restart these ioctls we've opted for the cheap way out of sleeping non-interruptibly. Which works everywhere but when the gpu dies, which this patch fixes. So essentially interruptible == false means 'wait for the gpu or die trying'.' This patch is a bit ugly because intel_ring_begin is all non-interruptible and hence only returns -EIO. But as the comment in there says, auditing all the callsites would be a pain. To avoid duplicating code, reuse i915_gem_check_wedge in __wait_seqno and intel_wait_ring_buffer. Also use the opportunity to clarify the different cases in i915_gem_check_wedge a bit with comments. v2: Don't access dev_priv->mm.interruptible from check_wedge - we might not hold dev->struct_mutex, making this racy. Instead pass interruptible in as a parameter. I've noticed this because I've hit a BUG_ON(!mutex_is_locked) at the top of check_wedge. This has been added in commit `b4aca0106c` Author: Ben Widawsky <ben@bwidawsk.net> Date: Wed Apr 25 20:50:12 2012 -0700 drm/i915: extract some common olr+wedge code although that commit is missing any justification for this. I guess it's just copy&paste, because the same commit add the same BUG_ON check to check_olr, where it indeed makes sense. But in check_wedge everything we access is protected by other means, so this is superflous. And because it now gets in the way (we add a new caller in __wait_seqno, which can be called without dev->struct_mutext) let's just remove it. v3: Group all the i915_gem_check_wedge refactoring into this patch, so that this patch here is all about not returning -EAGAIN to callsites that can't handle syscall restarting. v4: Add clarification what interuptible == fales means in our code, requested by Ben Widawsky. v5: Fix EAGAIN mispell noticed by Chris Wilson. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-07-05 10:01:14 +02:00
Daniel Vetter	cc889e0f6c	drm/i915: disable flushing_list/gpu_write_list This is just the minimal patch to disable all this code so that we can do decent amounts of QA before we rip it all out. The complicating thing is that we need to flush the gpu caches after the batchbuffer is emitted. Which is past the point of no return where execbuffer can't fail any more (otherwise we risk submitting the same batch multiple times). Hence we need to add a flag to track whether any caches associated with that ring are dirty. And emit the flush in add_request if that's the case. Note that this has a quite a few behaviour changes: - Caches get flushed/invalidated unconditionally. - Invalidation now happens after potential inter-ring sync. I've bantered around a bit with Chris on irc whether this fixes anything, and it might or might not. The only thing clear is that with these changes it's much easier to reason about correctness. Also rip out a lone get_next_request_seqno in the execbuffer retire_commands function. I've dug around and I couldn't figure out why that is still there, with the outstanding lazy request stuff it shouldn't be necessary. v2: Chris Wilson complained that I also invalidate the read caches when flushing after a batchbuffer. Now optimized. v3: Added some comments to explain the new flushing behaviour. Cc: Eric Anholt <eric@anholt.net> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-06-20 13:54:28 +02:00
Ben Widawsky	f2ef6eb145	drm/i915: switch to default context on idle To keep things as sane as possible, switch to the default context before idling. This should help free context objects, as well as put things in a more well defined state before suspending. v2: remove seqno from context switch call (daniel) return error on failed context switch instead of WARN+continue (daniel) v3: move idling to i915_gpu idle (from i915_gem_idle) (Chris) Signed-off-by: Ben Widawsky <ben@bwidawsk.net>	2012-06-14 17:36:20 +02:00
Ben Widawsky	254f965c39	drm/i915: preliminary context support Very basic code for context setup/destruction in the driver. Adds the file i915_gem_context.c This file implements HW context support. On gen5+ a HW context consists of an opaque GPU object which is referenced at times of context saves and restores. With RC6 enabled, the context is also referenced as the GPU enters and exists from RC6 (GPU has it's own internal power context, except on gen5). Though something like a context does exist for the media ring, the code only supports contexts for the render ring. In software, there is a distinction between contexts created by the user, and the default HW context. The default HW context is used by GPU clients that do not request setup of their own hardware context. The default context's state is never restored to help prevent programming errors. This would happen if a client ran and piggy-backed off another clients GPU state. The default context only exists to give the GPU some offset to load as the current to invoke a save of the context we actually care about. In fact, the code could likely be constructed, albeit in a more complicated fashion, to never use the default context, though that limits the driver's ability to swap out, and/or destroy other contexts. All other contexts are created as a request by the GPU client. These contexts store GPU state, and thus allow GPU clients to not re-emit state (and potentially query certain state) at any time. The kernel driver makes certain that the appropriate commands are inserted. There are 4 entry points into the contexts, init, fini, open, close. The names are self-explanatory except that init can be called during reset, and also during pm thaw/resume. As we expect our context to be preserved across these events, we do not reinitialize in this case. As Adam Jackson pointed out, The cutoff of 1MB where a HW context is considered too big is arbitrary. The reason for this is even though context sizes are increasing with every generation, they have yet to eclipse even 32k. If we somehow read back way more than that, it probably means BIOS has done something strange, or we're running on a platform that wasn't designed for this. v2: rename load/unload to init/fini (daniel) remove ILK support for get_size() (indirectly daniel) add HAS_HW_CONTEXTS macro to clarify supported platforms (daniel) added comments (Ben) Signed-off-by: Ben Widawsky <ben@bwidawsk.net>	2012-06-14 17:36:16 +02:00
Daniel Vetter	8ecd1a6615	drm/i915: call intel_enable_gtt When drm/i915 is in control of the gtt, we need to call the enable function at all the relevant places ourselves. Reviewed-by: Jani Nikula <jani.nikula@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-06-12 22:21:07 +02:00
Daniel Vetter	dd2757f8b5	drm/i915: stop using dev->agp->base For that to work we need to export the base address of the gtt mmio window from intel-gtt. Also replace all other uses of dev->agp by values we already have at hand. Reviewed-by: Jani Nikula <jani.nikula@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-06-12 22:18:06 +02:00
Ben Widawsky	eac1f14fd1	drm/i915: Inifite timeout for wait ioctl Change the ns_timeout parameter of the wait ioctl to a signed value. Doing this allows the kernel to provide an infinite wait when a timeout of less than 0 is provided. This mimics select/poll. Initially the parameter was meant to match up with the GL spec 1:1, but after being made aware of how much 2^64 - 1 nanoseconds actually is, I do not think anyone will ever notice the loss of 1 bit. The infinite timeout on waiting is similar to the existing i915 userspace interface with the exception that struct_mutex is dropped while doing the wait in this ioctl. Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-06-06 12:25:46 +02:00
Daniel Vetter	30dfebf34b	drm/i915: extract object active state flushing code Both busy_ioctl and the new wait_ioct need to do the same dance (or at least should). Some slight changes: - busy_ioctl now unconditionally checks for olr. Before emitting a require flush would have prevent the olr check and hence required a second call to the busy ioctl to really emit the request. - the timeout wait now also retires request. Not really required for abi-reasons, but makes a notch more sense imo. I've tested this by pimping the i-g-t test some more and also checking the polling behviour of the wait_rendering_timeout ioctl versus what busy_ioctl returns. v2: Too many people complained about unplug, new color is flush_active. v3: Kill the comment about the unplug moniker. v4: s/un-active/inactive/ Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-06-02 20:51:03 +02:00
Daniel Vetter	e269f90f3d	Merge remote-tracking branch 'airlied/drm-prime-vmap' into drm-intel-next-queued We need the latest dma-buf code from Dave Airlie so that we can pimp the backing storage handling code in drm/i915 with Chris Wilson's unbound tracking and stolen mem backed gem object code. Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-06-01 10:52:54 +02:00
Ben Widawsky	b9524a1e1c	drm/i915: remap l3 on hw init If any l3 rows have been previously remapped, we must remap them after GPU reset/resume too. v2: Just return (no warn) on remapping init if not IVB (Jesse) Move the check of schizo userspace to i915_gem_l3_remap (Jesse) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-31 12:11:29 +02:00
Dave Airlie	a21f976094	Merge branch 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel into drm-fixes * 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel: drm/i915: tune down the noise of the RP irq limit fail drm/i915: Remove the error message for unbinding pinned buffers drm/i915: Limit page allocations to lowmem (dma32) for i965 drm/i915: always use RPNSWREQ for turbo change requests drm/i915: reject doubleclocked cea modes on dp drm/i915: Adding TV Out Missing modes. drm/i915: wait for a vblank to pass after tv detect drm/i915: no lvds quirk for HP t5740e Thin Client drm/i915: enable vdd when switching off the eDP panel drm/i915: Fix PCH PLL assertions to not assume CRTC:PLL relationship drm/i915: Always update RPS interrupts thresholds along with frequency drm/i915: properly handle interlaced bit for sdvo dtd conversion drm/i915: fix module unload since error_state rework drm/i915: be more careful when returning -ENXIO in gmbus transfer	2012-05-29 11:09:06 +01:00
Ben Widawsky	199b2bc25b	drm/i915: s/i915_wait_request/i915_wait_seqno/g Wait request is poorly named IMO. After working with these functions for some time, I feel it's much clearer to name the functions more appropriately. Of course we must update the callers to use the new name as well. This leaves room within our namespace for a real wait request function at some point. Note to maintainer: this patch is optional. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-25 14:18:42 +02:00
Ben Widawsky	23ba4fd0a4	drm/i915: wait render timeout ioctl This helps implement GL_ARB_sync but stops short of allowing full blown sync objects. Finally we can use the new timed seqno waiting function to allow userspace to wait on a buffer object with a timeout. This implements that interface. The IOCTL will take as input a buffer object handle, and a timeout in nanoseconds (flags is currently optional but will likely be used for permutations of flush operations). Users may specify 0 nanoseconds to instantly check. The wait ioctl with a timeout of 0 reimplements the busy ioctl. With any non-zero timeout parameter the wait ioctl will wait for the given number of nanoseconds on an object becoming unbusy. Since the wait itself does so holding struct_mutex the object may become re-busied before this completes. A similar but shorter race condition exists in the busy ioctl. v2: ETIME/ERESTARTSYS instead of changing to EBUSY, and EGAIN (Chris) Flush the object from the gpu write domain (Chris + Daniel) Fix leaked refcount in good case (Chris) Naturally align ioctl struct (Chris) v3: Drop lock after getting seqno to avoid ugly dance (Chris) v4: check for 0 timeout after olr check to allow polling (Chris) v5: Updated the comment. (Chris) v6: Return -ETIME instead of -EBUSY when timeout_ns is 0 (Daniel) Fix the commit message comment to be less ugly (Ben) Add a warning to check the return timespec (Ben) v7: Use DRM_AUTH for the ioctl. (Eugeni) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-25 14:15:46 +02:00
Chris Wilson	31d8d651eb	drm/i915: Remove the error message for unbinding pinned buffers This is now used intentionally to prevent proliferation of is-pinned checks upon the inactive list following: commit `1b50247a8d` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Tue Apr 24 15:47:30 2012 +0100 drm/i915: Remove the list of pinned inactive objects Reported-and-tested-by: guang.a.yang@intel.com Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50075 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-25 10:10:40 +02:00
Chris Wilson	bed1ea95a3	drm/i915: Limit page allocations to lowmem (dma32) for i965 Broadwater and Crestline share a limitation that prevent it from relocating general surface state above 4GiB. The only recourse we have since any buffer object may be used as a relocation target is then to limit all object allocations on 965g[m] to DMA32. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-25 10:07:06 +02:00
Ben Widawsky	5c81fe85da	drm/i915: timeout parameter for seqno wait Insert a wait parameter in the code so we can possibly timeout on a seqno wait if need be. The code should be functionally the same as before because all the callers will continue to retry if an arbitrary timeout elapses. We'd like to have nanosecond granularity, but the only way to do this is with hrtimer, and that doesn't fit well with the needs of this code. v2: Fix rebase error (Chris) Return proper time even in wedged + signal case (Chris + Ben) Use timespec constructs (Ben) Didn't take Daniel's advice regarding the Frankenstein-ness of the function. I did try his advice, but in the end I liked the way the original code looked, better. v3: Make wakeups far less frequent for infinite waits (Chris) v4: Remove dummy_wait variable (Daniel) Use raw monotonic time instead of jiffies (made the code a bit cleaner) (Ben) Added a couple of warnings (Ben) v5: Remove warnings (Daniel) Use more accurate time diff for default case (Daniel) Bikeshed for setting the return timespec in timeout case (Daniel) s/jiffies/time in one of the comments Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-25 09:55:08 +02:00
Daniel Vetter	1286ff7397	i915: add dmabuf/prime buffer sharing support. This adds handle->fd and fd->handle support to i915, this is to allow for offloading of rendering in one direction and outputs in the other. v2 from Daniel Vetter: - fixup conflicts with the prepare/finish gtt prep work. - implement ppgtt binding support. Note that we have squat i-g-t testcoverage for any of the lifetime and access rules dma_buf/prime support brings along. And there are quite a few intricate situations here. Also note that the integration with the existing code is a bit hackish, especially around get_gtt_pages and put_gtt_pages. It imo would be easier with the prep code from Chris Wilson's unbound series, but that is for 3.6. Also note that I didn't bother to put the new prepare/finish gtt hooks to good use by moving the dma_buf_map/unmap_attachment calls in there (like we've originally planned for). Last but not least this patch is only compile-tested, but I've changed very little compared to Dave Airlie's version. So there's a decent chance v2 on drm-next works as well as v1 on 3.4-rc. v3: Right when I've hit sent I've noticed that I've screwed up one obj->sg_list (for dmar support) and obj->sg_table (for prime support) disdinction. We should be able to merge these 2 paths, but that's material for another patch. v4: fix the error reporting bugs pointed out by ickle. v5: fix another error, and stop non-gtt mmaps on shared objects stop pread/pwrite on imported objects, add fake kmap Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-23 10:47:10 +01:00
Chris Wilson	b4519513e8	drm/i915: Introduce for_each_ring() macro In many places we wish to iterate over the rings associated with the GPU, so refactor them to use a common macro. Along the way, there are a few code removals that should be side-effect free and some rearrangement which should only have a cosmetic impact, such as error-state. Note that this slightly changes the semantics in the hangcheck code: We now always cycle through all enabled rings instead of short-circuiting the logic. v2: Pull in a couple of suggestions from Ben and Daniel for intel_ring_initialized() and not removing the warning (just moving them to a new home, closer to the error). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> [danvet: Added note to commit message about the small behaviour change, suggested by Ben Widawsky.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-19 22:39:53 +02:00
Daniel Vetter	5e13a0c5ec	Merge remote-tracking branch 'airlied/drm-core-next' into drm-intel-next-queued Backmerge of drm-next to resolve a few ugly conflicts and to get a few fixes from 3.4-rc6 (which drm-next has already merged). Note that this merge also restricts the stencil cache lra evict policy workaround to snb (as it should) - I had to frob the code anyway because the CM0_MASK_SHIFT define died in the masked bit cleanups. We need the backmerge to get Paulo Zanoni's infoframe regression fix for gm45 - further bugfixes from him touch the same area and would needlessly conflict. Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-08 13:39:59 +02:00
Daniel Vetter	dc257cf154	Linux 3.4-rc6 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (GNU/Linux) iQEcBAABAgAGBQJPpvY9AAoJEHm+PkMAQRiGpEoIAJgbu+Y8gITnBK/wh9O6zy3S 5jie5KK4YWdbJsvO58WbNr3CyVIwGIqQ2dUZLiU59aBVLarlGw8xor0MmW+cZwhp 6fBHaf0qDYAV0MZjD+mnnExOiCRyISa2lPmsfu9dAWywh5KGe6/oAP6/qcXIyok3 KZyl3qQf4ENpaZPHwZPXCEkUvtuyHgNiszN+QXEadA3s19Ot4VGe9A3VGw+GNrSm JqFIq3acQAbKa5BYaqf7TQC02v2FI7//eqt6QHxTqbE6a7LGbTvLfX3HlJ2mnfqa 1R6QHhM4y4OZDHbaMT2raHZ8WuLXzhehJzhP8Co7AHFOKwVKOb5XbcUr2RrukMU= =HkMd -----END PGP SIGNATURE----- Merge tag 'v3.4-rc6' into drm-intel-next Conflicts: drivers/gpu/drm/i915/intel_display.c Ok, this is a fun story of git totally messing things up. There /shouldn't/ be any conflict in here, because the fixes in -rc6 do only touch functions that have not been changed in -next. The offending commits in drm-next are 14415745b2..1fa611065 which simply move a few functions from intel_display.c to intel_pm.c. The problem seems to be that git diff gets completely confused: $ git diff 14415745b2..1fa611065 is a nice mess in intel_display.c, and the diff leaks into totally unrelated functions, whereas $git diff --minimal 14415745b2..1fa611065 is exactly what we want. Unfortunately there seems to be no way to teach similar smarts to the merge diff and conflict generation code, because with the minimal diff there really shouldn't be any conflicts. For added hilarity, every time something in that area changes the + and - lines in the diff move around like crazy, again resulting in new conflicts. So I fear this mess will stay with us for a little longer (and might result in another backmerge down the road). Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-07 14:02:14 +02:00
Ben Widawsky	b4aca0106c	drm/i915: extract some common olr+wedge code The new wait_rendering ioctl also needs to check for an oustanding lazy request, and we already duplicate that logic at three places. So extract it. While at it, also extract the code to check the gpu wedging state to improve code flow. v2: Don't use seqno as an outparam (Chris) v3 by danvet: Kill stale comment and pimp commit message Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-03 11:18:32 +02:00
Daniel Vetter	53ca26cab8	drm/i915 disallow physical batchbuffers for KMS Even the horrible gen3 XvMC code has learned to do this right by the time xf86-video-intel releases learned to do kernel modesetting. So we can just disallow this. Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-03 11:18:25 +02:00
Daniel Vetter	8781342df7	drm/i915: create dev_priv->dri1 dragon dungeon^W^W sub-struct ... and shove allow_batchbuffer in there. More dragons will follow suit. There's the curious case that we allow this for KMS ... Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-03 11:18:25 +02:00
Ben Widawsky	3b88cc0dd7	drm/i915: use __wait_seqno for ring throttle It turns out throttle had an almost identical bit of code to do the wait. Now we can call the new helper directly. This is just a bonus, and not needed for the overall series. v2: remove irq_get/put which is now in __wait_seqno (Ben) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-03 11:18:23 +02:00
Ben Widawsky	4146b08d76	drm/i915: remove polled wait from throttle It's about to go away anyway. Just here to help bisection. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-03 11:18:22 +02:00
Ben Widawsky	604dd3ec75	drm/i915: extract __wait_seqno from i915_wait_request i915_wait_request is actually a fairly large function encapsulating quite a few different operations. Because being able to wait on seqnos in various conditions is useful, extracting that bit of code to a helper function seems useful v2: pull the irq_get/put as well (Ben) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-03 11:18:22 +02:00
Ben Widawsky	c58cf4f108	drm/i915: drop polled waits from i915_wait_request The only time irq_get should fail is during unload or suspend. Both of these points should try to quiesce the GPU before disabling interrupts and so the atomic polling should never occur. This was recommended by Chris Wilson as a way of reducing added complexity to the polled wait which I introduced in an RFC patch. 09:57 < ickle_> it's only there as a fudge for waiting after irqs after uninstalled during s&r, we aren't actually meant to hit it 09:57 < ickle_> so maybe we should just kill the code there and fix the breakage v2: return -ENODEV instead of -EBUSY when irq_get fails Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-03 11:18:22 +02:00
Ben Widawsky	9574b3fe29	drm/i915: kill waiting_seqno The waiting_seqno is not terribly useful, and as such we can remove it so that we'll be able to extract lockless code. v2: Keep the information for error_state (Chris) Check if ring is initialized in hangcheck (Chris) Capture the waiting ring (Chris) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: add some bikeshed to clarify a comment.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-03 11:18:21 +02:00
Ben Widawsky	be998e2e39	drm/i915: move vbetool invoked ier stuff This extra bit of interrupt enabling code doesn't belong in the wait seqno function. If anything we should pull it out to a helper so the throttle code can also use it. The history is a bit vague, but I am going to attempt to just dump it, unless someone can argue otherwise. Removing this allows for a shared lock free wait seqno function. To keep tabs on this issue though, the IER value is stored on error capture (recommended by Chris Wilson) v2: fixed typo EIR->IER (Ben) Fix some white space (Ben) Move IER capture to globally instead of per ring (Ben) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: ier is a 16 bit reg on gen2!] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-03 11:18:21 +02:00
Ben Widawsky	b2da9fe5d5	drm/i915: remove do_retire from i915_wait_request This originates from a hack by me to quickly fix a bug in an earlier patch where we needed control over whether or not waiting on a seqno actually did any retire list processing. Since the two operations aren't clearly related, we should pull the parameter out of the wait function, and make the caller responsible for retiring if the action is desired. The only function call site which did not get an explicit retire_request call (on purpose) is i915_gem_inactive_shrink(). That code was already calling retire_request a second time. v2: don't modify any behavior excepit i915_gem_inactive_shrink(Daniel) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-03 11:18:20 +02:00
Daniel Vetter	507432986c	drm/i915: use the new masked bit macro some more I've missed this one. v2: Chris Wilson noticed another register. v3: Color choice improvements. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-03 11:18:20 +02:00
Daniel Vetter	63ed2cb2d1	drm/i915: rip out GEM drm feature checks We always set it so there's no point in checking. We could instead add a bit that tells us whether gem is actually initialized (i.e. either kms or gem_init_ioctl called), but that's imho not worth it. So just rip it out. There's a little change in the wait_ring timeout, but we've never run with anything else than the 60 second timeout, even on dri1 userspace. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-03 11:18:14 +02:00
Daniel Vetter	7bb6fb8dd9	drm/i915: disallow gem ums init ioctl for kms This ioctl used in a kms driver is only useful to create massive havoc. v2: Bail out with -ENODEV as suggested by Chris Wilson. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-03 11:18:13 +02:00
Chris Wilson	1070a42b6b	drm/i915: Move GEM initialisation from i915_dma.c to i915_gem.c Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-03 11:18:12 +02:00
Chris Wilson	1488fc08c1	drm/i915: Remove the deferred-free list The use of the mm_list by deferred-free breaks the following patches to extend the range of objects tracked. We can simplify things if we just make the unbind during free uninterrutible. Note that unbinding should never fail, because we hold an additional reference on every active object. Only the ilk vt-d workaround breaks this, but already takes care of not failing by waiting for the gpu to quiescent non-interruptible. But the existence of the deferred free list casted some doubts on this theory, hence WARN if the unbind fails and only then retry non-interruptible. We can kill this additional code after a release in case the theory is indeed right and no one has hit that WARN. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-03 11:18:11 +02:00
Chris Wilson	1b50247a8d	drm/i915: Remove the list of pinned inactive objects Simplify object tracking by removing the inactive but pinned list. The only place where this was used is for counting the available memory, which is just as easy performed by checking all objects on the rare occasions it is required (application startup). For ease of debugging, we keep the reporting of pinned objects through the error-state and debugfs. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-03 11:18:11 +02:00
Chris Wilson	a39d7efc62	drm/i915: Remove i915_gem_evict_inactive() This was only used by one external caller who would just be as happy with evict-everything, so perform the replacement and make the function private. In the process we note that unbinding the inactive list should not fail, and make it a warning instead. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-03 11:18:10 +02:00
Chris Wilson	8325a09dd0	drm/i915: Bump the inactive LRU on set-to-GTT-domain Currently, we only bump the inactive LRU of an object when we bind into the GTT for a page-fault. As the object may be used many times before its mapping is zapped, we do not mark it as active as frequently as we should. Userspace should be calling set-to-GTT-domain before each pointer deference (for synchronous access) and so is a good place to mark the buffer as active. Marking the buffer as recently used places it at the end of the inactive eviction queue, though still before anything with outstanding rendering. This reduces the likelihood of evicting a buffer that is going to be used again by the CPU in the near future. This way we can hopefully avoid to kick out upload buffers right before we use them on the gpu. Note that we need to check that the object is not active or pinned, for otherwise we create havoc on the active/pinned lists, which also use obj->mm_list. The active lists are sorted by and evicted in last GPU rendering order, access by the CPU to a still active buffer therefore does not affect its eviction ordering. Pinned objects are currently excluded from eviction, therefore the only list that we need to bump for GTT access by the CPU is the inactive list. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Added further explanations to the commit message as discussed on irc.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-03 11:18:10 +02:00
Daniel Vetter	6b26c86d61	drm/i915: create macros to handle masked bits ... and put them to so good use. Note that there's functional change in vlv clock gating code, we now no longer spuriously read back the current value of the bit. According to Bspec the high bits should always read zero, so ORing this in should have no effect. Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-03 11:18:08 +02:00
Chris Wilson	5d82e3e642	drm/i915: Clarify the semantics of tiling_changed Rename obj->tiling_changed to obj->fence_dirty so that it is clear that it flags when the parameters for an active fence (including the no-fence) register are changed. Also, do not set this flag when the object does not have a fence register allocated currently and the gpu does not depend upon the unfence. This case works exactly like when a tiled object lost its fence and hence does not need additional handling for the tiling change in the code. v2: Use fence_dirty to better express what the flag tracks and add a few more details to the comments to serve as a reminder of how the GPU also uses the unfenced register slot. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: Add some bikeshed to the commit message about the stricter use of fence_dirty.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-03 11:18:06 +02:00
Ben Widawsky	4f0c7cfbb4	drm/i915: [sparse] __iomem fixes for gem As with one of the earlier patches in the series, we're forced to cast for copy_[to\|from]_user. Again because of the nature of the GEN x86 exclusivity, this should be safe. Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> [danvet: Added some bikeshed.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-05-03 11:18:01 +02:00
Linus Torvalds	6be5ceb02e	VM: add "vm_mmap()" helper function This continues the theme started with vm_brk() and vm_munmap(): vm_mmap() does the same thing as do_mmap(), but additionally does the required VM locking. This uninlines (and rewrites it to be clearer) do_mmap(), which sadly duplicates it in mm/mmap.c and mm/nommu.c. But that way we don't have to export our internal do_mmap_pgoff() function. Some day we hopefully don't have to export do_mmap() either, if all modular users can become the simpler vm_mmap() instead. We're actually very close to that already, with the notable exception of the (broken) use in i810, and a couple of stragglers in binfmt_elf. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-04-20 17:29:13 -07:00
Chris Wilson	14415745b2	drm/i915: Refactor get_fence() to use the common fence writing routine We can also take advantage of the new 'no retire' mode for seqno waiting to avoid having to take a reference on the old fence object whilst flushing an existing fence. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-18 13:40:51 +02:00
Chris Wilson	ada726c734	drm/i915: Refactor fence clearing to use the common fence writing routine Now that we have a routine that is able to clear the fences as well as setup up the register for a tiled object, remove the surplus routines to clear the fences. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-18 13:34:53 +02:00
Chris Wilson	61050808bb	drm/i915: Refactor put_fence() to use the common fence writing routine One clarification that we make is to the existing semantics of obj->tiling_changed to only mean that we need to update an associated fence register (including the NO_FENCE when executing an untiled but fenced GPU command). If we do not have a fence register or pending fenced GPU access for the object (after put_fence() for example), then we can clear the tiling_changed flag as any fence will necessarily be rewritten upon acquisition. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-18 13:34:30 +02:00
Chris Wilson	9ce079e481	drm/i915: Prepare to consolidate fence writing Update the existing architecture specific fence writing routines to either update the fence to point to a tiled object or to clear them in preparation to remove the other fence writing routes. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-18 13:24:32 +02:00
Chris Wilson	1899184547	drm/i915: Remove the unsightly "optimisation" from flush_fence() As i915_wait_request() will first check for an already passed seqno, doing it also in the caller is a waste of space for a cold path. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-18 13:23:17 +02:00
Chris Wilson	8fe301add5	drm/i915: Simplify fence finding As the fences are stored in LRU order, we can simply reuse the oldest if we do not have an unused register. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-18 13:20:35 +02:00
Chris Wilson	1c293ea3b1	drm/i915: Discard the unused obj->last_fenced_ring As we now never pipeline a fence update, obj->last_fenced_ring is always the same as the obj->ring whenever obj->last_fenced_seqno is active, so remove it. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-18 13:19:51 +02:00
Chris Wilson	69963e7c76	drm/i915: Remove unused ring->setup_seqno As we now no longer track a pipelined fence change, we never use ring->setup_seqno and can kill it. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-18 13:18:52 +02:00
Chris Wilson	a360bb1a83	drm/i915: Remove fence pipelining Step 2 is then to replace the pipelined parameter with NULL and perform constant folding to remove dead code. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-18 13:18:25 +02:00
Chris Wilson	06d9813157	drm/i915: Remove the pipelined parameter from get_fence() We never succeeded in getting pipelined fencing to work (unresolved spurious GPU hangs), so begin the process of dismantling and removal the broken code. Step 1 is the removal of the pipeline parameter to get_fence(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-18 13:15:43 +02:00
Daniel Vetter	48ecfa1090	drm/i915: properly set ppgtt cacheability on snb For some reason snb has 2 fields to set ppgtt cacheability. This one here does not exist on gen7. This might explain why ppgtt wasn't a win on snb like on ivb - not enough pte caching. v2: Fixup rebase fail. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-17 11:19:59 +02:00
Daniel Vetter	be901a5a1b	drm/i915: set w/a bit for snb pagefaults Bspec says that we need to set this: vol1c.3 "Blitter Command Streamer", Section 1.1.2.1 "GAB_CTL_REG - GAB Unit Control Register". We don't really rely on pagefaults, but who knows what this all affects. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-17 11:19:56 +02:00
Daniel Vetter	767878908e	Linux 3.4-rc3 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (GNU/Linux) iQEbBAABAgAGBQJPi3XOAAoJEHm+PkMAQRiGnsUH9RjHwH4YFVyuP/DKtKa6zs74 wqkpT15yITQ5WWMog4JaJFFg5rJCUd8QZr7AS/HSn0ijDyZX5VU7Rcs9cMudDzNR H/5K/AscS4fjb0HwWVqoltTWHRb9QGSwVN3+E3VCDLt9P89YJ0o3QztkkuEX5dkZ jc7reVXTfRnCcILEa9jleOzrn+OLM3j/jAjQ2hGunl8EDLzD4b17HHPoli4jEZ/5 5ibpSVsPD+AqzN+glbXvYjVItl12D0IQos/JdOwfuZriCVWLxysSSwHZTbPCyvBZ LHH4HR5T+XLSXbjJeNkUFHLzqU+d5gVRadIoWtJCxqxFjKbOs2YtzJ5Ai0nDiw== =kTkC -----END PGP SIGNATURE----- Merge tag 'v3.4-rc3' into drm-intel-next-queued Backmerge Linux 3.4-rc3 into drm-intel-next to resolve a few things that conflict/depend upon patches in -rc3: - Second part of the Sandybridge workaround series - it changes some of the same registers. - Preparation for Chris Wilson's fencing cleanup - we need the fix from -rc3 merged before we can move around all that code. - Resolve the gmbus conflict - gmbus has been disabled in 3.4 again, but should be enabled on all generations in 3.5. Conflicts: drivers/gpu/drm/i915/intel_i2c.c Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-17 11:16:20 +02:00
Daniel Vetter	c07496fa61	drm/i915: don't pwrite tiled objects through the gtt ... we will botch up the bit17 swizzling. Furthermore tiled pwrite is a (now) unused slowpath, so no one really cares. This fixes the last swizzling issues I have with i-g-t on my bit17 swizzling i915G. No regression, it's been broken since the dawn of gem, but it's nice for regression tracking when really _all_ i-g-t tests work. Actually this is not true, Chris Wilson noticed while reviewing this patch that the commit commit `d9e86c0ee6` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Wed Nov 10 16:40:20 2010 +0000 drm/i915: Pipelined fencing [infrastructure] contained a functional change that broke things. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-15 19:37:42 +02:00
Ben Widawsky	1500f7ea06	drm/i915: hide (seqno-1) in ringbuffer code Waiting for seqno-1 in our object synchronization code is an implementation detail given how we've decided to do the waits within the rest of our code. Requested-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-12 21:14:14 +02:00
Ben Widawsky	e3a5a2250a	drm/i915: fix for when semaphore updates fail This fixes a long standing issue where emitting the semaphore updates may have failed, but we've already updated our internal data structure. Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-12 21:14:13 +02:00
Ben Widawsky	5816d648d5	drm/i915: i915_gem_object_sync must handle NULL When I extracted the synchronization code for implementing semaphorified pageflips (74f5f6e0), I neglected the non pipelined case which also calls this code. The modesetting code wants to make sure the object has finished rendering to the frame before configuring the scanout (ie. non-pipelined case). As a result of a follow on discussion on IRC, I've decided to add a comment about the function itself which received much inspiration from Chris as well. So really, this patch was ghost-written by Chris :). Reported-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-12 21:14:13 +02:00
Chris Wilson	f84131905b	drm/i915: Allow concurrent read access between CPU and GPU domain Similar to allowing a buffer to be simultaneously read by the GPU and through the GTT, we wish to allow readback of the pages through the CPU domain whilst they are also being read by the GPU. Domain coherency is managed by allowing multiple readers, but only a single writer. This is used by mesa for its program cache which it may search for every new program every frame and then renews should it need to add. During renewal, mesa copies the program bo currently executing through a CPU mapping onto the new bo. This patch allows the search and that copy to proceed without causing a stall on the current batch. Testcase: i-g-t/tests/gem_cpu_concurrent_blit Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-12 21:14:10 +02:00
Ben Widawsky	2911a35b2e	drm/i915: use semaphores for the display plane In theory this will have performance and power improvements. Performance because we don't need to stall when the scanout BO is busy, and power because we don't have to stall when the BO is busy (and the ring can even go to sleep if the HW supports it). v2: squash 2 patches into 1 (me) un-inline the enable_semaphores function (Daniel) remove comment about SNB hangs from i915_gem_object_sync (Chris) rename intel_enable_semaphores to i915_semaphore_is_enabled (me) removed page flip comment; "no why" (Chris) To address other comments from Daniel (irc): update the comment to say 'vt-d is crap, don't enable semaphores' - I think you misinterpreted Chris' comment, it already exists. checking out whether we can pageflip on the render ring on ivb (didn't work on early silicon) - We don't want to enable workarounds for early silicon unless we have to. - I can't find any references in the docs about this. optionally use it if the fb is already busy on the render ring - This should be how the code already worked, unless I am misunderstanding your meaning. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-12 21:14:05 +02:00
Chris Wilson	9a5a53b392	drm/i915: Reorganise rules for get_fence/put_fence By simplifying the rules to calling get_fence when writing to the through the GTT in a tiled manner, and calling put_fence before writing to the object through the GTT in a linear manner, the code becomes clearer and there is less chance of making a mistake. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> [danvet: fixed up conflict with ppgtt code and spelling in a new comment.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-12 21:14:04 +02:00
Dave Airlie	effbc4fd8e	Merge branch 'drm-intel-next' of git://people.freedesktop.org/~danvet/drm-intel into drm-core-next Daniel Vetter wrote First pull request for 3.5-next, slightly large than usual because new things kept coming in since the last pull for 3.4. Highlights: - first batch of hw enablement for vlv (Jesse et al) and hsw (Eugeni). pci ids are not yet added, and there's still quite a few patches to merge (mostly modesetting). To make QA easier I've decided to merge this stuff in pieces. - loads of cleanups and prep patches spurred by the above. Especially vlv is a real frankenstein chip, but also hsw is stretching our driver's code design. Expect more to come in this area for 3.5. - more gmbus fixes, cleanups and improvements by Daniel Kurtz. Again, there are more patches needed (and some already queued up), but I wanted to split this a bit for better testing. - pwrite/pread rework and retuning. This series has been in the works for a few months already and a lot of i-g-t tests have been created for it. Now it's finally ready to be merged. Note that one patch in this series touches include/pagemap.h, that patch is acked-by akpm. - reduce mappable pressure and relocation throughput improvements from Chris. - mmap offset exhaustion mitigation by Chris Wilson. - a start at figuring out which codepaths in our messy dri1/ums+gem/kms driver we actually need to support by bailing out of unsupported case. The driver now refuses to load without kms on gen6+ and disallows a few ioctls that userspace never used in certain cases. More of this will definitely come. - More decoupling of global gtt and ppgtt. - Improved dual-link lvds detection by Takashi Iwai. - Shut up the compiler + plus fix the fallout (Ben) - Inverted panel brightness handling (mostly Acer manages to break things in this way). - Small fixlets and adjustements and some minor things to help debugging. Regression-wise QA reported quite a few issues on ivb, but all of them turned out to be hw stability issues which are already fixed in drm-intel-fixes (QA runs the nightly regression tests on -next alone, without -fixes automatically merged in). There's still one issue open on snb, it looks like occlusion query writes are not quite as cache coherent as we've expected. With some of the pwrite adjustements we can now reliably hit this. Kernel workaround for it is in the works." * 'drm-intel-next' of git://people.freedesktop.org/~danvet/drm-intel: (101 commits) drm/i915: VCS is not the last ring drm/i915: Add a dual link lvds quirk for MacBook Pro 8,2 drm/i915: make quirks more verbose drm/i915: dump the DMA fetch addr register on pre-gen6 drm/i915/sdvo: Include YRPB as an additional TV output type drm/i915: disallow gem init ioctl on ilk drm/i915: refuse to load on gen6+ without kms drm/i915: extract gt interrupt handler drm/i915: use render gen to switch ring irq functions drm/i915: rip out old HWSTAM missed irq WA for vlv drm/i915: open code gen6+ ring irqs drm/i915: ring irq cleanups drm/i915: add SFUSE_STRAP registers for digital port detection drm/i915: add WM_LINETIME registers drm/i915: add WRPLL clocks drm/i915: add LCPLL control registers drm/i915: add SSC offsets for SBI access drm/i915: add port clock selection support for HSW drm/i915: add S PLL control drm/i915: add PIXCLK_GATE register ... Conflicts: drivers/char/agp/intel-agp.h drivers/char/agp/intel-gtt.c drivers/gpu/drm/i915/i915_debugfs.c	2012-04-12 10:27:01 +01:00
Daniel Vetter	15a13bbdff	drm/i915: clear fencing tracking state when retiring requests This fixes a resume regression introduced in commit `7dd4906586` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Wed Mar 21 10:48:18 2012 +0000 drm/i915: Mark untiled BLT commands as fenced on gen2/3 which fixed fencing tracking for untiled blt commands. A side effect of that patch was that now also untiled objects have a non-zero obj->last_fenced_seqno to track when a fence can be set up after a pipelined tiling change. Unfortunately this was only cleared by the fence setup and teardown code, resulting in tons of untiled but inactive objects with non-zero last_fenced_seqno. Now after resume we completely reset the seqno tracking, both on the driver side (by setting dev_priv->next_seqno = 1) and on the hw side (by allocating a new hws page, which contains the seqnos). Hilarity and indefinite waits ensued from the stale seqnos in obj->last_fenced_seqno from before the suspend. The fix is to properly clear the fencing tracking state like we already do for the normal gpu rendering while moving objects off the active list. Reported-and-tested-by: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Jiri Slaby <jslaby@suse.cz> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-12 09:02:37 +02:00
Daniel Vetter	f534bc0b22	drm/i915: disallow gem init ioctl on ilk Ums is already disabled, but on ilk we can additionally disable gem initialization when using user mode setting. Upstream never support ilk without kernel modesetting and not even the RHEL ilk ums backport needs gem - that driver is based on xf86-video-intel version 2.2, which is pre-gem. Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-09 18:04:08 +02:00
Chris Wilson	7dd4906586	drm/i915: Mark untiled BLT commands as fenced on gen2/3 The BLT commands on gen2/3 utilize the fence registers and so we cannot modify any fences for the object whilst those commands are in flight. Currently we marked tiled commands as occupying a fence, but forgot to restrict the untiled commands from preventing a fence being assigned before they were completed. One side-effect is that we ten have to double check that a fence was allocated for a fenced buffer during move-to-active. Reported-by: Jiri Slaby <jirislaby@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=43427 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47990 Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Testcase: i-g-t/tests/gem_tiled_after_untiled_blt Tested-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-01 12:26:05 +02:00
Daniel Vetter	55a254ac63	drm/i915: properly restore the ppgtt page directory on resume The ppgtt page directory lives in a snatched part of the gtt pte range. Which naturally gets cleared on hibernate when we pull the power. Suspend to ram (which is what I've tested) works because despite the fact that this is a mmio region, it is actually back by system ram. Fix this by moving the page directory setup code to the ppgtt init code (which gets called on resume). This fixes hibernate on my ivb and snb. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-01 12:25:29 +02:00
Jesse Barnes	23e3f9b37e	drm/i915: check for disabled interrupts on ValleyView Haven't seen this yet, but it doesn't hurt. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-29 00:11:46 +02:00
Daniel Vetter	e7e58eb5c0	drm/i915: mark pwrite/pread slowpaths with unlikely Beside helping the compiler untangle this maze they double-up as documentation for which parts of the code aren't performance-critical but just around to keep old (but already dead-slow) userspace from breaking. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-27 13:41:41 +02:00
Daniel Vetter	23c18c71da	drm/i915: fixup in-line clflushing on bit17 swizzled bos The issue is that with inline clflushing the clflushing isn't properly swizzled. Fix this by - always clflushing entire 128 byte chunks and - unconditionally flush before writes when swizzling a given page. We could be clever and check whether we pwrite a partial 128 byte chunk instead of a partial cacheline, but I've figured that's not worth it. Now the usual approach is to fold this into the original patch series, but I've opted against this because - this fixes a corner case only very old userspace relies on and - I'd like to not invalidate all the testing the pwrite rewrite has gotten. This fixes the regression notice by tests/gem_tiled_partial_prite_pread from i-g-t. Unfortunately it doesn't fix the issues with partial pwrites to tiled buffers on bit17 swizzling machines. But that is also broken without the pwrite patches, so likely a different issue (or a problem with the testcase). v2: Simplify the patch by dropping the overly clever partial write logic for swizzled pages. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-27 13:40:57 +02:00
Daniel Vetter	f56f821feb	mm: extend prefault helpers to fault in more than PAGE_SIZE drm/i915 wants to read/write more than one page in its fastpath and hence needs to prefault more than PAGE_SIZE bytes. Add new functions in filemap.h to make that possible. Also kill a copy&pasted spurious space in both functions while at it. v2: As suggested by Andrew Morton, add a multipage parameter to both functions to avoid the additional branch for the pagemap.c hotpath. My gcc 4.6 here seems to dtrt and indeed reap these branches where not needed. v3: Becaus I couldn't find a way around adding a uaddr += PAGE_SIZE to the filemap.c hotpaths (that the compiler couldn't remove again), let's go with separate new functions for the multipage use-case. v4: Adjust comment to CodingStlye and fix spelling. Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-27 13:36:30 +02:00
Daniel Vetter	d174bd6472	drm/i915: extract copy helpers from shmem_pread\|pwrite While moving around things, this two functions slowly grew out of any sane bounds. So extract a few lines that do the copying and clflushing. Also add a few comments to explain what's going on. v2: Again do s/needs_clflush/needs_clflush_after/ in the write paths as suggested by Chris Wilson. Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-27 13:30:33 +02:00
Daniel Vetter	117babcdd5	drm/i915: use uncached writes in pwrite It's around 20% faster. Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-27 13:29:38 +02:00
Daniel Vetter	ffc62976d2	drm/i915: fall back to shmem pwrite when the buffer is not accessible It's too expensive to move it around just for that pwrite, especially when we're trashing on the mappable gtt part like crazy. Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-27 13:29:08 +02:00
Daniel Vetter	586428852a	drm/i915: implement inline clflush for pwrite In micro-benchmarking of the usual pwrite use-pattern of alternating pwrites with gtt domain reads from the gpu, this yields around 30% improvement of pwrite throughput across all buffers size. The trick is that we can avoid clflush cachelines that we will overwrite completely anyway. Furthermore for partial pwrites it gives a proportional speedup on top of the 30% percent because we only clflush back the part of the buffer we're actually writing. v2: Simplify the clflush-before-write logic, as suggested by Chris Wilson. v3: Finishing touches suggested by Chris Wilson: - add comment to needs_clflush_before and only set this if the bo is uncached. - s/needs_clflush/needs_clflush_after/ in the write paths to clearly differentiate it from needs_clflush_before. Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-27 13:28:45 +02:00
Daniel Vetter	96d79b5270	drm/i915: don't clobber userspace memory before commiting to the pread The pagemap.h prefault helpers do the prefaulting by simply writing some data into every page. Hence we should not prefault when we're not yet commited to to actually writing data to userspace. The problem is now that - we can't prefault while holding dev->struct_mutex for we could deadlock with our own pagefault handler - we need to grab dev->struct_mutex before copying to sync up with any outsanding gpu writes. Therefore only prefault when we're dropping the lock the first time in the pread slowpath - at that point we're committed to the write, don't wait on the gpu anymore and hence won't return early (with e.g. -EINTR). Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-27 13:28:32 +02:00
Daniel Vetter	935aaa692e	drm/i915: drop gtt slowpath With the proper prefault, it's extremely unlikely that we fall back to the gtt slowpath. So just kill it and use the shmem_pwrite path as fallback. To further clean up the code, move the preparatory gem calls into the respective pwrite functions. This way the gtt_fast->shmem fallback is much more obvious. Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-27 13:27:21 +02:00
Daniel Vetter	692a576b9d	drm/i915: don't call shmem_read_mapping unnecessarily This speeds up pwrite and pread from ~120 µs ro ~100 µs for reading/writing 1mb on my snb (if the backing storage pages are already pinned, of course). v2: Chris Wilson pointed out a glaring page reference bug - I've unconditionally dropped the reference. With that fixed (and the associated reduction of dirt in dmesg) it's now even a notch faster. v3: Unconditionaly grab a page reference when dropping dev->struct_mutex to simplify the code-flow. Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-27 13:27:03 +02:00
Daniel Vetter	3ae5378330	drm/i915: don't use gtt_pwrite on LLC cached objects ~120 µs instead fo ~210 µs to write 1mb on my snb. I like this. Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-27 13:25:45 +02:00
Daniel Vetter	a0356fc373	drm/i915: kill ranged cpu read domain support No longer needed. Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-27 13:25:32 +02:00
Daniel Vetter	8489731c9b	drm/i915: move clflushing into shmem_pread This is obviously gonna slow down pread. But for a half-way realistic micro-benchmark, it doesn't matter: Non-broken userspace reads back data from the gpu once before the gpu again dirties it. So all this ranged clflush tracking is just a waste of time. No pread performance change (neglecting the dumb benchmark of constantly reading the same data) measured. As an added bonus, this avoids clflush on read on coherent objects. Which means that partial preads on snb are now roughly 4x as fast. This will be usefull for e.g. the libva encoder - when I finally get around to fix that up. v2: Properly sync with the gpu on LLC machines. Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-27 13:20:01 +02:00
Daniel Vetter	dbf7bff074	drm/i915: merge shmem_pread slow&fast-path With the previous rewrite, they've become essential identical. v2: Simplify the page_do_bit17_swizzling logic as suggested by Chris Wilson. Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-27 13:19:11 +02:00
Daniel Vetter	e244a443bf	drm/i915: merge shmem_pwrite slow&fast-path With the previous rewrite, they've become essential identical. v2: Simplify the page_do_bit17_swizzling logic as suggested by Chris Wilson. Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-27 13:18:58 +02:00
Chris Wilson	dabdfe021a	drm/i915: Avoid using mappable space for relocation processing through the CPU We try to avoid writing the relocations through the uncached GTT, if the buffer is currently in the CPU write domain and so will be flushed out to main memory afterwards anyway. Also on SandyBridge we can safely write to the pages in cacheable memory, so long as the buffer is LLC mapped. In either of these cases, we therefore do not need to force the reallocation of the buffer into the mappable region of the GTT, reducing the aperture pressure. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-27 13:16:17 +02:00
Daniel Vetter	644ec02b5d	drm/i915: s/i915_gem_do_init/i915_gem_init_global_gtt ... because this is what it actually doesn now that we have the global gtt vs. ppgtt split. Also move it to the other global gtt functions in i915_gem_gtt.c Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-27 13:14:59 +02:00
Chris Wilson	a14917eeb2	drm/i915: Release the mmap offset when purging a buffer If we discard a buffer due to memory pressure, also release its alloted mmap address space. As it may be sometime before userspace wakes up and notices that it has buffers to purge from its cache, we may waste valuable address space on unusable objects for a period of time. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47738 Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-23 11:04:35 +01:00
Ben Widawsky	eb2c0c818a	drm/i915: [dinq] shut up two instances -Wunitialized Introduced in commit `8461d226` and `8c59967c` Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: s/fix/shut up/ in the commit msg.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-22 17:44:09 +01:00
Daniel Vetter	0ebb982993	drm/i915: enable lazy global-gtt binding Now that everything is in place, only bind to the global gtt when actually required. Patch split-up suggested by Chris Wilson. Reviewed-and-tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-20 21:55:16 +01:00
Daniel Vetter	74898d7edc	drm/i915: bind objects to the global gtt only when needed And track the existence of such a binding similar to the aliasing ppgtt case. Speeds up binding/unbinding in the common case where we only need a ppgtt binding (which is accessed in a cpu coherent fashion by the gpu) and no gloabl gtt binding (which needs uc writes for the ptes). This patch just puts the required tracking in place. v2: Check that global gtt mappings exist in the error_state capture code (with Chris Wilson's llc reloc patches batchbuffers are no longer relocated as mappable in all situations, so this matters). Suggested by Chris Wilson. v3: Adapted to Chris' latest llc-reloc patches. v4: Fix a bug in the i915 error state capture code noticed by Chris Wilson. Reviewed-and-tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-20 21:52:01 +01:00
Daniel Vetter	741639079c	drm/i915: split out dma mapping from global gtt bind/unbind functions Note that there's a functional change buried in this patch wrt the ilk dmar workaround: We now only idle the gpu while tearing down the dmar mappings, not while clearing the gtt. Keeping the current semantics would have made for some really ugly code and afaik the issue is only with the dmar unmapping that needs a fully idle gpu. Reviewed-and-tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-20 21:51:41 +01:00
Chris Wilson	c501ae7f33	drm/i915: Only clear the GPU domains upon a successful finish By clearing the GPU read domains before waiting upon the buffer, we run the risk of the wait being interrupted and the domains prematurely cleared. The next time we attempt to wait upon the buffer (after userspace handles the signal), we believe that the buffer is idle and so skip the wait. There are a number of bugs across all generations which show signs of an overly haste reuse of active buffers. Such as: https://bugs.freedesktop.org/show_bug.cgi?id=29046 https://bugs.freedesktop.org/show_bug.cgi?id=35863 https://bugs.freedesktop.org/show_bug.cgi?id=38952 https://bugs.freedesktop.org/show_bug.cgi?id=40282 https://bugs.freedesktop.org/show_bug.cgi?id=41098 https://bugs.freedesktop.org/show_bug.cgi?id=41102 https://bugs.freedesktop.org/show_bug.cgi?id=41284 https://bugs.freedesktop.org/show_bug.cgi?id=42141 A couple of those pre-date i915_gem_object_finish_gpu(), so may be unrelated (such as a wild write from a userspace command buffer), but this does look like a convincing cause for most of those bugs. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-01 21:36:13 +01:00
Chris Wilson	eadb29a9c5	drm/i915: Silence the error message from i915_wait_request() This error message has since been superseded by the hangcheck, and does not add any salient information beyond that already printed by hangcheck discovering the GPU hang that lead to i915_wait_request() bombing out in the first place. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-02-27 18:08:22 +01:00
Chris Wilson	a71d8d9452	drm/i915: Record the tail at each request and use it to estimate the head By recording the location of every request in the ringbuffer, we know that in order to retire the request the GPU must have finished reading it and so the GPU head is now beyond the tail of the request. We can therefore provide a conservative estimate of where the GPU is reading from in order to avoid having to read back the ring buffer registers when polling for space upon starting a new write into the ringbuffer. A secondary effect is that this allows us to convert intel_ring_buffer_wait() to use i915_wait_request() and so consolidate upon the single function to handle the complicated task of waiting upon the GPU. A necessary precaution is that we need to make that wait uninterruptible to match the existing conditions as all the callers of intel_ring_begin() have not been audited to handle ERESTARTSYS correctly. By using a conservative estimate for the head, and always processing all outstanding requests first, we prevent a race condition between using the estimate and direct reads of I915_RING_HEAD which could result in the value of the head going backwards, and the tail overflowing once again. We are also careful to mark any request that we skip over in order to free space in ring as consumed which provides a self-consistency check. Given sufficient abuse, such as a set of unthrottled GPU bound cairo-traces, avoiding the use of I915_RING_HEAD gives a 10-20% boost on Sandy Bridge (i5-2520m): firefox-paintball 18927ms -> 15646ms: 1.21x speedup firefox-fishtank 12563ms -> 11278ms: 1.11x speedup which is a mild consolation for the performance those traces achieved from exploiting the buggy autoreported head. v2: Add a few more comments and make request->tail a conservative estimate as suggested by Daniel Vetter. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: resolve conflicts with retirement defering and the lack of the autoreport head removal (that will go in through -fixes).] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-02-15 14:26:03 +01:00
Daniel Vetter	53d227f282	drm/i915: fixup seqno allocation logic for lazy_request Currently we reserve seqnos only when we emit the request to the ring (by bumping dev_priv->next_seqno), but start using it much earlier for ring->oustanding_lazy_request. When 2 threads compete for the gpu and run on two different rings (e.g. ddx on blitter vs. compositor) hilarity ensued, especially when we get constantly interrupted while reserving buffers. Breakage seems to have been introduced in commit `6f392d5486` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Sat Aug 7 11:01:22 2010 +0100 drm/i915: Use a common seqno for all rings. This patch fixes up the seqno reservation logic by moving it into i915_gem_next_request_seqno. The ring->add_request functions now superflously still return the new seqno through a pointer, that will be refactored in the next patch. Note that with this change we now unconditionally allocate a seqno, even when ->add_request might fail because the rings are full and the gpu died. But this does not open up a new can of worms because we can already leave behind an outstanding_request_seqno if e.g. the caller gets interrupted with a signal while stalling for the gpu in the eviciton paths. And with the bugfix we only ever have one seqno allocated per ring (and only that ring), so there are no ordering issues with multiple outstanding seqnos on the same ring. v2: Keep i915_gem_get_seqno (but move it to i915_gem.c) to make it clear that we only have one seqno counter for all rings. Suggested by Chris Wilson. v3: As suggested by Chris Wilson use i915_gem_next_request_seqno instead of ring->oustanding_lazy_request to make the follow-up refactoring more clearly correct. Also improve the commit message with issues discussed on irc. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45181 Tested-by: Nicolas Kalkhof nkalkhof()at()web.de Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-02-13 10:55:57 +01:00
Daniel Vetter	5391d0cffe	drm/i915: outstanding_lazy_request is a u32 So don't assign it false, that's just confusing ... No functional change here. Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-02-13 10:55:48 +01:00
Daniel Vetter	e21af88d39	drm/i915: enable ppgtt We want to unconditionally enable ppgtt for two reasons: - Windows uses this on snb and later. - We need the basic hw support to work before we can think about real per-process address spaces and other cool features we want. But Chris Wilson was complaining all over irc and intel-gfx that this will blow up if we don't have a module option to disable it. Hence add one, to prevent this. ppgtt support seems to slightly change the timings and make crashy things slightly more or less crashy. Now in my testing and the testing this got on troublesome snb machines, it seems to have improved things only. But on ivb it makes quite a few crashes happen much more often, see https://bugs.freedesktop.org/show_bug.cgi?id=41353 Luckily Eugeni Dodonov seems to have a set of workarounds that fix this issue. v2: Don't try to enable ppgtt on pre-snb. v3: Pimp commit message and make Chris Wilson less grumpy by adding a module option. v4: New try at making Chris Wilson happy. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-02-09 21:49:30 +01:00
Daniel Vetter	7bddb01fb9	drm/i915: ppgtt binding/unbinding support This adds support to bind/unbind objects and wires it up. Objects are only put into the ppgtt when necessary, i.e. at execbuf time. Objects are still unconditionally put into the global gtt. v2: Kill the quick hack and explicitly pass cache_level to ppgtt_bind like for the global gtt function. Noticed by Chris Wilson. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-02-09 21:25:23 +01:00
Daniel Vetter	11782b0233	drm/i915: consolidate swizzling control bit frobbing On gen5 we also need to correctly set up swizzling in the display scanout engine, but only there. Consolidate this into the same function. This has a small effect on ums setups - the kernel now also sets this bit in addition to userspace setting it. Given that this code only runs when userspace either can't (resume, gpu reset) or explicitly won't(gem_init) touch the hw this shouldn't have an adverse effect. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-02-08 23:18:27 +01:00
Daniel Vetter	f691e2f4ce	drm/i915: swizzling support for snb/ivb We have to do this manually. Somebody had a Great Idea. I've measured speed-ups just a few percent above the noise level (below 5% for the best case), but no slowdows. Chris Wilson measured quite a bit more (10-20% above the usual snb variance) on a more recent and better tuned version of sna, but also recorded a few slow-downs on benchmarks know for uglier amounts of snb-induced variance. v2: Incorporate Ben Widawsky's preliminary review comments and elaborate a bit about the performance impact in the changelog. v3: Add a comment as to why we don't need to check the 3rd memory channel. v4: Fixup whitespace. Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-02-08 23:16:24 +01:00
Daniel Vetter	8461d22677	drm/i915: rewrite shmem_pread_slow to use copy_to_user Like for shmem_pwrite_slow. The only difference is that because we read data, we can leave the fetched cachelines in the cpu: In the case that the object isn't in the cpu read domain anymore, the clflush for the next cpu read domain invalidation will simply drop these cachelines. slow_shmem_bit17_copy is now ununsed, so kill it. With this patch tests/gem_mmap_gtt now actually works. v2: add __ to copy_to_user_swizzled as suggested by Chris Wilson. v3: Fixup the swizzling logic, it swizzled the wrong pages. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38115 Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-01-30 23:34:34 +01:00
Daniel Vetter	8c59967c44	drm/i915: rewrite shmem_pwrite_slow to use copy_from_user ... instead of get_user_pages, because that fails on non page-backed user addresses like e.g. a gtt mapping of a bo. To get there essentially copy the vfs read path into pagecache. We can't call that right away because we have to take care of bit17 swizzling. To not deadlock with our own pagefault handler we need to completely drop struct_mutex, reducing the atomicty-guarantees of our userspace abi. Implications for racing with other gem ioctl: - execbuf, pwrite, pread: Due to -EFAULT fallback to slow paths there's already the risk of the pwrite call not being atomic, no degration. - read/write access to mmaps: already fully racy, no degration. - set_tiling: Calling set_tiling while reading/writing is already pretty much undefined, now it just got a bit worse. set_tiling is only called by libdrm on unused/new bos, so no problem. - set_domain: When changing to the gtt domain while copying (without any read/write access, e.g. for synchronization), we might leave unflushed data in the cpu caches. The clflush_object at the end of pwrite_slow takes care of this problem. - truncating of purgeable objects: the shmem_read_mapping_page call could reinstate backing storage for truncated objects. The check at the end of pwrite_slow takes care of this. v2: - add missing intel_gtt_chipset_flush - add __ to copy_from_user_swizzled as suggest by Chris Wilson. v3: Fixup bit17 swizzling, it swizzled the wrong pages. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-01-30 23:34:21 +01:00
Daniel Vetter	5c0480f21f	drm/i915: fall through pwrite_gtt_slow to the shmem slow path The gtt_pwrite slowpath grabs the userspace memory with get_user_pages. This will not work for non-page backed memory, like a gtt mmapped gem object. Hence fall throuh to the shmem paths if we hit -EFAULT in the gtt paths. Now the shmem paths have exactly the same problem, but this way we only need to rearrange the code in one write path. v2: v1 accidentaly falls back to shmem pwrite for phys objects. Fixed. v3: Make the codeflow around phys_pwrite cleara as suggested by Chris Wilson. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-01-30 23:34:07 +01:00
Chris Wilson	068c6ff1cb	drm/i915: Remove the upper limit on the bo size for mapping into the CPU domain The original intention of comparing the bo against the mappable GTT limits was to prevent a subsequent faulting of the bo into the GTT from clearing the entire GTT in vain. However, that was clearly a cut'n'paste mistake as a CPU mapping never binds the bo into the aperture. Whilst there may be some merit to limiting the maximum size of the bo to something that can be utilized by the GPU, that limit itself does not belong as a safeguard to mmapping the bo, so remove the check entirely. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-01-30 17:54:35 +01:00
Daniel Vetter	39965b3766	drm/i915: don't trash the gtt when running out of fences With the fence accounting fixed up in the previous commit not finding enough fences is a fatal error and userspace bug. Trashing the entire gtt is not gonna turn up that missing fence, so don't to this by returning another error thatn ENOSPC. This has the added benefit that it's easier to distinguish fence accounting errors from gtt space accounting issues. TTM serves as precendence for the EDEADLK error code - it returns it when the reservation code needs resources already blocked by the current reservation. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-01-29 18:24:10 +01:00
Chris Wilson	1690e1eb7a	drm/i915: Separate fence pin counting from normal bind pin counting In order to correctly account for reserving space in the GTT and fences for a batch buffer, we need to independently track whether the fence is pinned due to a fenced GPU access in the batch or whether the buffer is pinned in the aperture. Currently we count the fenced as pinned if the buffer has already been seen in the execbuffer. This leads to a false accounting of available fence registers, causing frequent mass evictions. Worse, if coupled with the change to make i915_gem_object_get_fence() report EDADLK upon fence starvation, the batchbuffer can fail with only one fence required... Fixes intel-gpu-tools/tests/gem_fenced_exec_thrash Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38735 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Tested-by: Paul Neumann <paul104x@yahoo.de> [danvet: Resolve the functional conflict with Jesse Barnes sprite patches, acked by Chris Wilson on irc.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-01-29 18:23:37 +01:00
Ben Widawsky	b93f9cf14e	drm/i915: argument to control retiring behavior Sometimes it may be the case when we idle the gpu or wait on something we don't actually want to process the retiring list. This patch allows callers to choose the behavior. Reviewed-by: Keith Packard <keithp@keithp.com> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-01-26 11:19:19 +01:00
Eugeni Dodonov	3d29b842e5	drm/i915: add a LLC feature flag in device description LLC is not SNB/IVB-specific, so we should check for it in a more generic way. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-01-17 20:01:45 +01:00
Eric Anholt	e959b5db4a	drm/i915: Make the fallback IRQ wait not sleep. The waits we do here are generally so short that sleeping is a bad idea unless we have an IRQ to wake us up. Improves regression test performance from 18 minutes to 3.5 minutes on gen7, which is now consistent with the previous generation. Signed-off-by: Eric Anholt <eric@anholt.net> Tested-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Keith Packard <keithp@keithp.com>	2012-01-03 09:31:16 -08:00
Eric Anholt	7ea29b13e5	drm/i915: Do the fallback non-IRQ wait in ring throttle, too. As a workaround for IRQ synchronization issues in the gen7 BLT ring, we want to turn the two wait functions into polling loops. Signed-off-by: Eric Anholt <eric@anholt.net> Tested-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Keith Packard <keithp@keithp.com>	2012-01-03 09:31:14 -08:00
Linus Torvalds	ed4a51842a	Revert "drm/i915: fix infinite recursion on unbind due to ilk vt-d w/a" This reverts commit `eb1711bb94`. It blows up the i915 seqno tracking, resulting in the BUG_ON(seqno == 0); in i915_wait_request() triggering, which will cause lock-ups. See for example https://bugs.launchpad.net/ubuntu/+source/linux/+bug/903010 https://lkml.org/lkml/2011/12/14/395 Reported-requested-and-tested-by: Dirk Hohndel <dirk@hohndel.org> Reported-by: Richard Eames <Richard.Eames@flinders.edu.au> Reported-by: Rocko Requin <rockorequin@hotmail.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dave Airlie <airlied@redhat.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Keith Packard <keithp@keithp.com> Cc: Eric Anholt <eric@anholt.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-12-16 12:58:39 -08:00
Daniel Vetter	eb1711bb94	drm/i915: fix infinite recursion on unbind due to ilk vt-d w/a The recursion loop goes retire_requests->unbind->gpu_idle->retire_reqeusts. Every time we go through this we need a - active object that can be retired - and there are no other references to that object than the one from the active list, so that it gets unbound and freed immediately. Otherwise the recursion stops. So the recursion is only limited by the number of objects that fit these requirements sitting in the active list any time retire_request is called. Issue exercised by tests/gem_unref_active_buffers from i-g-t. There's been a decent bikeshed discussion whether it wouldn't be better to pass around a flag, but imo this is o.k. for such a limited case that only supports a w/a. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=42180 Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Chris Wilson <chris@chris-wilson> [ickle- we built better bikesheds, but this keeps the rain off for now] Tested-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2011-12-07 10:44:40 +00:00
Rakib Mullick	457eafce61	drm, i915: Fix memory leak in i915_gem_busy_ioctl(). A call to i915_add_request() has been made in function i915_gem_busy_ioctl(). i915_add_request can fail, so in it's exit path previously allocated memory needs to be freed. Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com> Reviewed-by: Keith Packard <keithp@keithp.com> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-11-17 12:57:45 -08:00
Jesse Barnes	680da876f4	drm/i915: enable cacheable objects on Ivybridge IVB supports these bits as well. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-11-03 16:17:57 -07:00
Daniel Vetter	4b9de737fa	drm/i915: add constants to size fence arrays and fields In preparation of to support 32 fences on Ivybdrigde. Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-11-03 09:20:37 -07:00
Eric Anholt	ff56b0bc84	drm/i915: Fix object refcount leak on mmappable size limit error path. I've been seeing memory leaks on my system in the form of large (300-400MB) GEM objects created by now-dead processes laying around clogging up memory. I usually notice when it gets to about 1.2GB of them. Hopefully this clears up the issue, but I just found this bug by inspection. Signed-off-by: Eric Anholt <eric@anholt.net> Cc: stable@kernel.org Signed-off-by: Keith Packard <keithp@keithp.com>	2011-11-01 09:15:17 -07:00
Ben Widawsky	f372b85463	drm/i915: Remove early exit on i915_gpu_idle [Description from: Daniel Vetter] I've just discussed this quickly with Chris on irc and it's probably best to just kill the list_empty early bailout. gpu_idle isn't a fastpath, so who cares. One candidate where we emit commands to the ring without adding anything onto these lists is e.g. pageflip. There are probably more. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-10-20 15:26:38 -07:00
Daniel Vetter	130c2561de	drm/i915: drop KM_USER0 argument to k(un)map_atomic Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-10-20 15:26:37 -07:00
Chris Wilson	8ffc024681	drm/i915: Defend against userspace creating a gem object with size==0 We currently only round up the userspace size to the next page. We assume that userspace hasn't made a mistake and requested a zero-length gem object and all through our internal code we then presume that every object is backed by at least a single page. Fix that oversight and report EINVAL back to userspace if they try to create a zero length object. [danvet: This fixes tests/gem_bad_length] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-10-20 14:11:19 -07:00
Daniel Vetter	6dacfd2faa	drm/i915: simplify swapin/out swizzle checking a bit Use the helper function already employed by the pwrite/pread functions. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-10-20 14:11:18 -07:00
Dave Airlie	88ef4e3f4f	Merge branch 'drm-intel-next' of git://people.freedesktop.org/~keithp/linux into drm-next * 'drm-intel-next' of git://people.freedesktop.org/~keithp/linux: Drivers: i915: Fix all space related issues.	2011-09-20 09:36:22 +01:00
Akshay Joshi	0206e353a0	Drivers: i915: Fix all space related issues. Various issues involved with the space character were generating warnings in the checkpatch.pl file. This patch removes most of those warnings. Signed-off-by: Akshay Joshi <me@akshayjoshi.com> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-09-19 18:01:47 -07:00
Rob Clark	b464e9a25c	drm/i915: use common functions for mmap offset creation Signed-off-by: Rob Clark <rob@ti.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2011-08-30 11:07:00 +01:00
Keith Packard	df7976797f	Merge branch 'drm-intel-fixes' into drm-intel-next	2011-07-22 13:40:42 -07:00
Keith Packard	f0b69efc29	drm/i915: Skip GPU wait for scanout pin while wedged Failing to pin a scanout buffer will most likely lead to a black screen, so if the GPU is wedged, then just let the pin happen and hope that things work out OK. v2: Just ignore any error from i915_gem_object_wait_rendering, as suggested by Chris Wilson Signed-off-by: Keith Packard <keithp@keithp.com>	2011-07-21 20:18:31 -07:00
Chris Wilson	e28f871165	drm/i915: Fix unfenced alignment on pre-G33 hardware Align unfenced buffers on older hardware to the power-of-two object size. The docs suggest that it should be possible to align only to a power-of-two tile height, but using the already computed fence size is easier and always correct. We also have to make sure that we unbind misaligned buffers upon tiling changes. In order to prevent a repetition of this bug, we change the interface to the alignment computation routines to force the caller to provide the requested alignment and size of the GTT binding rather than assume the current values on the object. Reported-and-tested-by: Sitosfe Wheeler <sitsofe@yahoo.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=36326 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-07-18 14:02:06 -07:00
Keith Packard	8eb2c0ee67	Merge branch 'drm-intel-fixes' into drm-intel-next	2011-06-29 10:34:54 -07:00
Ben Widawsky	3e0dc6b01f	drm/i915: hangcheck disable parameter Provide a parameter to disable hanghcheck. This is useful mostly for developers trying to debug known problems, and probably should not be touched by normal users. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-06-29 10:32:08 -07:00
Linus Torvalds	0d72c6fcb5	Merge branch 'drm-intel-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/keithp/linux-2.6 * 'drm-intel-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/keithp/linux-2.6: drm/i915: Use chipset-specific irq installers drm/i915: forcewake fix after reset drm/i915: add Ivy Bridge page flip support drm/i915: split page flip queueing into per-chipset functions	2011-06-28 11:15:57 -07:00
Keith Packard	6ae77e6b6a	Merge branch 'drm-intel-fixes' into drm-intel-next	2011-06-28 10:29:47 -07:00
Chris Wilson	f01c22fd59	drm/i915: Use chipset-specific irq installers Konstantin Belousov pointed out that `4697995b98` replaced the generic i915_driver_irq_install() functions with chipset specific routines accessible only through driver->irq_install(). So update the sanity check in i915_request_wait() to match. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-06-28 10:20:06 -07:00
Hugh Dickins	e2377fe0b6	drm/i915: use shmem_truncate_range The interface to ->truncate_range is changing very slightly: once "tmpfs: take control of its truncate_range" has been applied, this can be applied. For now there is only a slight inefficiency while this remains unapplied, but it will soon become essential for managing shmem's use of swap. Change i915_gem_object_truncate() to use shmem_truncate_range() directly: which should also spare i915 later change if we switch from inode_operations->truncate_range to file_operations->fallocate. Signed-off-by: Hugh Dickins <hughd@google.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Keith Packard <keithp@keithp.com> Cc: Dave Airlie <airlied@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-06-27 18:00:14 -07:00
Hugh Dickins	5949eac4d9	drm/i915: use shmem_read_mapping_page Soon tmpfs will stop supporting ->readpage and read_cache_page_gfp(): once "tmpfs: add shmem_read_mapping_page_gfp" has been applied, this patch can be applied to ease the transition. Make i915_gem_object_get_pages_gtt() use shmem_read_mapping_page_gfp() in the one place it's needed; elsewhere use shmem_read_mapping_page(), with the mapping's gfp_mask properly initialized. Forget about __GFP_COLD: since tmpfs initializes its pages with memset, asking for a cold page is counter-productive. Include linux/shmem_fs.h also in drm_gem.c: with shmem_file_setup() now declared there too, we shall remove the prototype from linux/mm.h later. Signed-off-by: Hugh Dickins <hughd@google.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Keith Packard <keithp@keithp.com> Cc: Dave Airlie <airlied@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-06-27 18:00:13 -07:00
Keith Packard	b97c3d9c16	drm/i915: i915_gem_object_finish_gtt must always release gtt mmap Even if the object is no longer in the GTT domain, there may still be a user space mapping which needs to be released. Without this fix, render-based text (mostly in firefox) would occasionally get corrupted when the system was under load. Signed-off-by: Keith Packard <keithp@keithp.com>	2011-06-24 21:02:59 -07:00
Keith Packard	2cd1176bd9	Merge branch 'drm-intel-fixes' into drm-intel-next	2011-06-21 12:02:57 -07:00
Eric Anholt	e92d03bff9	Revert "drm/i915: Kill GTT mappings when moving from GTT domain" This reverts commit `4a684a4117`. Userland has always been required to set the object's domain to GTT before using it through a GTT mapping, it's not something that the kernel is supposed to enforce. (The pagefault support is so that we can handle multiple mappings without userland having to pin across them, not so that userland can use GTT after GPU domains without telling the kernel). Fixes 19.2% +/- 0.8% (n=6) performance regression in cairo-gl firefox-talos-gfx on my T420 latop. Signed-off-by: Keith Packard <keithp@keithp.com>	2011-06-21 11:11:02 -07:00
Jesper Juhl	b65552f06c	drm/i915: Don't leak in i915_gem_shmem_pread_slow() It seems to me that we are leaking 'user_pages' in drivers/gpu/drm/i915/i915_gem.c::i915_gem_shmem_pread_slow() if read_cache_page_gfp() fails. Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Dave Airlie <airlied@redhat.com>	2011-06-14 11:00:54 +10:00
Eric Anholt	a187111207	drm/i915: Use the LLC mode on gen6 for everything but display. Improves full-screen openarena on my laptop 20.3% +/- 4.0% (n=3) Improves 800x600 nexuiz on my laptop 12.3% +/- 0.1% (n=3) We have more room to improve with doing LLC caching for display using GFDT, and in doing LLC+MLC caching, but this was an easy performance win and incremental improvement toward those two. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2011-06-09 21:51:22 -07:00
Eric Anholt	a7ef0640d9	drm/i915: Use the uncached domain for the display planes The simplest and common method for ensuring scanout coherency on all chipsets is to mark the scanout buffers as uncached (and for userspace to remember to flush the render cache every so often). We can improve upon this for later generations by marking scanout objects as GFDT and only flush those cachelines when required. However, we start simple. [v2: Move the set to uncached above the clflush. Otherwise, we'd skip the clflush and try to scan out data that was still sitting in the cache.] Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2011-06-09 21:51:20 -07:00
Chris Wilson	2da3b9b940	drm/i915: Combine pinning with setting to the display plane We need to perform a few operations in order to move the object into the display plane (where it can be accessed coherently by the display engine) that are important for future safety to forbid whilst pinned. As a result, we want to need to perform some of the operations before pinning, but some are required once we have been bound into the GTT. So combine the pinning performed by all the callers with set_to_display_plane(), so this complication is contained within the single function. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2011-06-09 21:51:19 -07:00
Chris Wilson	e4ffd173a1	drm/i915: Add an interface to dynamically change the cache level [anholt v2: Don't forget that when going from cached to uncached, we haven't been tracking the write domain from the CPU perspective, since we haven't needed it for GPU coherency.] [ickle v3: We also need to make sure we relinquish any fences on older chipsets and clear the GTT for sane domain tracking.] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2011-06-09 21:51:16 -07:00
Chris Wilson	b5ffc9bc38	drm/i915: Introduce i915_gem_object_finish_gtt() Like its siblings finish_gpu(), this function clears the object from the GTT domain forcing it to be trigger a domain invalidation should we ever need to use via the GTT again. Note that the most important side-effect of finishing the GTT domain (aside from clearing the tracking read/write domains) is that it imposes an memory barrier so that all accesses are complete before it returns, which is important if you intend to be modifying translation tables shortly afterwards. The second most important side-effect is that it tears down the GTT mappings forcing a page-fault and invalidation on next user access to the object. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2011-06-09 21:51:14 -07:00
Chris Wilson	a8198eea15	drm/i915: Introduce i915_gem_object_finish_gpu() ... reincarnated from i915_gem_object_flush_gpu(). The semantic difference is that after calling finish_gpu() the object no longer resides in any GPU domain, and so will cause the GPU caches to be invalidated if it is ever used again. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2011-06-09 11:43:47 -07:00
Daniel Vetter	c8ebc2b076	drm/915: fix relaxed tiling on gen2: tile height A tile on gen2 has a size of 2kb, stride of 128 bytes and 16 rows. Userspace was broken and assumed 8 rows. Chris Wilson noted that the kernel unfortunately can't reliable check that because libdrm rounds up the size to the next bucket. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-06-04 10:41:12 -07:00
Chris Wilson	c8cbbb8ba9	drm/i915: s/addr & ~PAGE_MASK/offset_in_page(addr)/ Convert our open coded offset_in_page() to the common macro. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Keith Packard <keithp@keithp.com> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-06-04 10:40:42 -07:00
Ying Han	1495f230fa	vmscan: change shrinker API by passing shrink_control struct Change each shrinker's API by consolidating the existing parameters into shrink_control struct. This will simplify any further features added w/o touching each file of shrinker. [akpm@linux-foundation.org: fix build] [akpm@linux-foundation.org: fix warning] [kosaki.motohiro@jp.fujitsu.com: fix up new shrinker API] [akpm@linux-foundation.org: fix xfs warning] [akpm@linux-foundation.org: update gfs2] Signed-off-by: Ying Han <yinghan@google.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Minchan Kim <minchan.kim@gmail.com> Acked-by: Pavel Emelyanov <xemul@openvz.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Acked-by: Rik van Riel <riel@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Hugh Dickins <hughd@google.com> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-25 08:39:26 -07:00
Eric Anholt	25aebfc30b	drm/i915: Add support for fence registers on Ivybridge. The registers are the same as on Sandybridge. Fixes scrambled display in X when it does software drawing to the GTT, and scans the results out as tiled. Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-05-13 18:12:51 -07:00
Eric Anholt	10ed13e4a5	drm/i915: Use existing function instead of open-coding fence reg clear. This is once less place to miss a new INTEL_INFO(dev)->gen update now. Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-05-13 18:12:50 -07:00
Chris Wilson	9c23f7fc4c	drm/i915: Do not clflush snooped objects Rely on the GPU snooping into the CPU cache for appropriately bound objects on MI_FLUSH. Or perhaps one day we will have a cache-coherent CPU/GPU package... Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-05-10 13:56:44 -07:00
Chris Wilson	93dfb40cd8	drm/i915: Rename agp_type to cache_level ... to clarify just how we use it inside the driver and remove the confusion of the poorly matching agp_type names. We still need to translate through agp_type for interface into the fake AGP driver. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-05-10 13:56:43 -07:00
Chris Wilson	f6e47884e7	drm/i915: Avoid unmapping pages from a NULL address space Found by gem_stress. As we perform retirement from a workqueue, it is possible for us to free and unbind objects after the last close on the device, and so after the address space has been torn down and reset to NULL: BUG: unable to handle kernel NULL pointer dereference at 00000054 IP: [<c1295a20>] mutex_lock+0xf/0x27 *pde = 00000000 Oops: 0002 [#1] SMP last sysfs file: /sys/module/vt/parameters/default_utf8 Pid: 5, comm: kworker/u:0 Not tainted 2.6.38+ #214 EIP: 0060:[<c1295a20>] EFLAGS: 00010206 CPU: 1 EIP is at mutex_lock+0xf/0x27 EAX: 00000054 EBX: 00000054 ECX: 00000000 EDX: 00012fff ESI: 00000028 EDI: 00000000 EBP: f706fe20 ESP: f706fe18 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 Process kworker/u:0 (pid: 5, ti=f706e000 task=f7060d00 task.ti=f706e000) Stack: f5aa3c60 00000000 f706fe74 c107e7df 00000246 dea55380 00000054 f5aa3c60 f706fe44 00000061 f70b4000 c13fff84 00000008 f706fe54 00000000 00000000 00012f00 00012fff 00000028 c109e575 f6b36700 00100000 00000000 f706fe90 Call Trace: [<c107e7df>] unmap_mapping_range+0x7d/0x1e6 [<c109e575>] ? mntput_no_expire+0x52/0xb6 [<c11c12f6>] i915_gem_release_mmap+0x49/0x58 [<c11c3449>] i915_gem_object_unbind+0x4c/0x125 [<c11c353f>] i915_gem_free_object_tail+0x1d/0xdb [<c11c55a2>] i915_gem_free_object+0x3d/0x41 [<c11a6be2>] ? drm_gem_object_free+0x0/0x27 [<c11a6c07>] drm_gem_object_free+0x25/0x27 [<c113c3ca>] kref_put+0x39/0x42 [<c11c0a59>] drm_gem_object_unreference+0x16/0x18 [<c11c0b15>] i915_gem_object_move_to_inactive+0xba/0xbe [<c11c0c87>] i915_gem_retire_requests_ring+0x16e/0x1a5 [<c11c3645>] i915_gem_retire_requests+0x48/0x63 [<c11c36ac>] i915_gem_retire_work_handler+0x4c/0x117 [<c10385d1>] process_one_work+0x140/0x21b [<c103734c>] ? __need_more_worker+0x13/0x2a [<c10373b1>] ? need_to_create_worker+0x1c/0x35 [<c11c3660>] ? i915_gem_retire_work_handler+0x0/0x117 [<c1038faf>] worker_thread+0xd4/0x14b [<c1038edb>] ? worker_thread+0x0/0x14b [<c103be1b>] kthread+0x68/0x6d [<c103bdb3>] ? kthread+0x0/0x6d [<c12970f6>] kernel_thread_helper+0x6/0x10 Code: 00 e8 98 fe ff ff 5d c3 55 89 e5 3e 8d 74 26 00 ba 01 00 00 00 e8 84 fe ff ff 5d c3 55 89 e5 53 8d 64 24 fc 3e 8d 74 26 00 89 c3 <f0> ff 08 79 05 e8 ab ff ff ff 89 e0 25 00 e0 ff ff 89 43 10 58 EIP: [<c1295a20>] mutex_lock+0xf/0x27 SS:ESP 0068:f706fe18 CR2: 0000000000000054 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Keith Packard <keithp@keithp.com>	2011-03-23 09:17:03 +00:00
Chris Wilson	26e12f8943	drm/i915: Fix use after free within tracepoint Detected by scripts/coccinelle/free/kfree.cocci. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Keith Packard <keithp@keithp.com>	2011-03-23 09:17:02 +00:00
Chris Wilson	36d527dead	drm/i915: Restore missing command flush before interrupt on BLT ring We always skipped flushing the BLT ring if the request flush did not include the RENDER domain. However, this neglects that we try to flush the COMMAND domain after every batch and before the breadcrumb interrupt (to make sure the batch is indeed completed prior to the interrupt firing and so insuring CPU coherency). As a result of the missing flush, incoherency did indeed creep in, most notable when using lots of command buffers and so potentially rewritting an active command buffer (i.e. the GPU was still executing from it even though the following interrupt had already fired and the request/buffer retired). As all ring->flush routines now have the same preconditions, de-duplicate and move those checks up into i915_gem_flush_ring(). Fixes gem_linear_blit. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=35284 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Tested-by: mengmeng.meng@intel.com	2011-03-23 09:17:01 +00:00
Chris Wilson	ed0291fd16	drm/i915: Fix computation of pitch for dumb bo creator Cc: Dave Airlie <airlied@linux.ie> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-03-23 09:17:00 +00:00
Chris Wilson	29c5a58728	drm/i915: Fix tiling corruption from pipelined fencing ... even though it was disabled. A mistake in the handling of fence reuse caused us to skip the vital delay of waiting for the object to finish rendering before changing the register. This resulted in us changing the fence register whilst the bo was active and so causing the blits to complete using the wrong stride or even the wrong tiling. (Visually the effect is that small blocks of the screen look like they have been interlaced). The fix is to wait for the GPU to finish using the memory region pointed to by the fence before changing it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34584 Cc: Andy Whitcroft <apw@canonical.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> [Note for 2.6.38-stable, we need to reintroduce the interruptible passing] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Dave Airlie <airlied@linux.ie>	2011-03-23 09:12:24 +00:00
Herton Ronaldo Krzesinski	09bfa51773	drm/i915: Prevent racy removal of request from client list When i915_gem_retire_requests_ring calls i915_gem_request_remove_from_client, the client_list for that request may already be removed in i915_gem_release. So we may call twice list_del(&request->client_list), resulting in an oops like this report: [126167.230394] BUG: unable to handle kernel paging request at 00100104 [126167.230699] IP: [<f8c2ce44>] i915_gem_retire_requests_ring+0xd4/0x240 [i915] [126167.231042] pdpt = 00000000314c1001 pde = 0000000000000000 [126167.231314] Oops: 0002 [#1] SMP [126167.231471] last sysfs file: /sys/devices/LNXSYSTM:00/device:00/PNP0C0A:00/power_supply/BAT1/current_now [126167.231901] Modules linked in: snd_seq_dummy nls_utf8 isofs btrfs zlib_deflate libcrc32c ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs xfs exportfs reiserfs cryptd aes_i586 aes_generic binfmt_misc vboxnetadp vboxnetflt vboxdrv parport_pc ppdev snd_hda_codec_hdmi snd_hda_codec_conexant snd_hda_intel snd_hda_codec snd_hwdep arc4 snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq uvcvideo videodev snd_timer snd_seq_device joydev iwlagn iwlcore mac80211 snd cfg80211 soundcore i915 drm_kms_helper snd_page_alloc psmouse drm serio_raw i2c_algo_bit video lp parport usbhid hid sky2 sdhci_pci ahci sdhci libahci [126167.232018] [126167.232018] Pid: 1101, comm: Xorg Not tainted 2.6.38-6-generic-pae #34-Ubuntu Gateway MC7833U / [126167.232018] EIP: 0060:[<f8c2ce44>] EFLAGS: 00213246 CPU: 0 [126167.232018] EIP is at i915_gem_retire_requests_ring+0xd4/0x240 [i915] [126167.232018] EAX: 00200200 EBX: f1ac25b0 ECX: 00000040 EDX: 00100100 [126167.232018] ESI: f1a2801c EDI: e87fc060 EBP: ef4d7dd8 ESP: ef4d7db0 [126167.232018] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 [126167.232018] Process Xorg (pid: 1101, ti=ef4d6000 task=f1ba6500 task.ti=ef4d6000) [126167.232018] Stack: [126167.232018] f1a28000 f1a2809c f1a28094 0058bd97 f1aa2400 f1a2801c 0058bd7b 0058bd85 [126167.232018] f1a2801c f1a28000 ef4d7e38 f8c2e995 ef4d7e30 ef4d7e60 c14d1ebc f6b3a040 [126167.232018] f1522cc0 000000db 00000000 f1ba6500 ffffffa1 00000000 00000001 f1a29214 [126167.232018] Call Trace: Unfortunately the call trace reported was cut, but looking at debug symbols the crash is at __list_del, when probably list_del is called twice on the same request->client_list, as the dereferenced value is LIST_POISON1 + 4, and by looking more at the debug symbols before list_del call it should have being called by i915_gem_request_remove_from_client And as I can see in the code, it seems we indeed have the possibility to remove a request->client_list twice, which would cause the above, because we do list_del(&request->client_list) on both i915_gem_request_remove_from_client and i915_gem_release As Chris Wilson pointed out, it's indeed the case: "(...) I had thought that the actual insertion/deletion was serialised under the struct mutex and the intention of the spinlock was to protect the unlocked list traversal during throttling. However, I missed that i915_gem_release() is also called without struct mutex and so we do need the double check for i915_gem_request_remove_from_client()." This change does the required check to avoid the duplicate remove of request->client_list. Bugzilla: http://bugs.launchpad.net/bugs/733780 Cc: stable@kernel.org # 2.6.38 Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-03-23 06:41:12 +00:00
Dave Airlie	34db18abd3	Merge remote branch 'intel/drm-intel-next' of ../drm-next into drm-core-next * 'intel/drm-intel-next' of ../drm-next: (755 commits) drm/i915: Only wait on a pending flip if we intend to write to the buffer drm/i915/dp: Sanity check eDP existence drm/i915: Rebind the buffer if its alignment constraints changes with tiling drm/i915: Disable GPU semaphores by default drm/i915: Do not overflow the MMADDR write FIFO Revert "drm/i915: fix corruptions on i8xx due to relaxed fencing" drm/i915: Don't save/restore hardware status page address register drm/i915: don't store the reg value for HWS_PGA drm/i915: fix memory corruption with GM965 and >4GB RAM Linux 2.6.38-rc7 Revert "TPM: Long default timeout fix" drm/i915: Re-enable GPU semaphores for SandyBridge mobile drm/i915: Replace vblank PM QoS with "Interrupt-Based AGPBUSY#" Revert "drm/i915: Use PM QoS to prevent C-State starvation of gen3 GPU" drm/i915: Allow relocation deltas outside of target bo drm/i915: Silence an innocuous compiler warning for an unused variable fs/block_dev.c: fix new kernel-doc warning ACPI: Fix build for CONFIG_NET unset mm: <asm-generic/pgtable.h> must include <linux/mm_types.h> x86: Use u32 instead of long to set reset vector back to 0 ... Conflicts: drivers/gpu/drm/i915/i915_gem.c	2011-03-14 14:15:13 +10:00
Chris Wilson	47ae63e0c2	Merge branch 'drm-intel-fixes' into drm-intel-next Apply the trivial conflicting regression fixes, but keep GPU semaphores enabled. Conflicts: drivers/gpu/drm/i915/i915_drv.h drivers/gpu/drm/i915/i915_gem_execbuffer.c	2011-03-07 12:35:15 +00:00
Chris Wilson	467cffba85	drm/i915: Rebind the buffer if its alignment constraints changes with tiling Early gen3 and gen2 chipset do not have the relaxed per-surface tiling constraints of the later chipsets, so we need to check that the GTT alignment is correct for the new tiling. If it is not, we need to rebind. Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-03-07 11:02:16 +00:00
Chris Wilson	ce453d81cb	drm/i915: Use a device flag for non-interruptible phases The code paths for modesetting are growing in complexity as we may need to move the buffers around in order to fit the scanout in the aperture. Therefore we face a choice as to whether to thread the interruptible status through the entire pinning and unbinding code paths or to add a flag to the device when we may not be interrupted by a signal. This does the latter and so fixes a few instances of modesetting failures under stress. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-02-22 15:56:25 +00:00
Chris Wilson	c872522663	drm/i915: Protect against drm_gem_object not being the first member Dave Airlie spotted that we had a potential bug should we ever rearrange the drm_i915_gem_object so not the base drm_gem_object was not its first member. He noticed that we often convert the return of drm_gem_object_lookup() immediately into drm_i915_gem_object and then check the result for nullity. This is only valid when the base object is the first member and so the superobject has the same address. Play safe instead and use the compiler to convert back to the original return address for sanity testing. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-02-22 15:55:57 +00:00
Chris Wilson	bed636abea	drm/i915: i915_mutex_interruptible() returns -EINTR ... so we handle that for i915_gem_fault() in the same manner as ERESTARTSYS, or we send a SIGBUS to the faulting application. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-02-11 20:32:44 +00:00
Chris Wilson	8d7e3de1e0	drm/i915: Skip the no-op domain changes when already in CPU\|GTT domains Removes some superfluous fluff from tracing... Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-02-07 15:24:03 +00:00
Chris Wilson	db53a30261	drm/i915: Refine tracepoints A lot of minor tweaks to fix the tracepoints, improve the outputting for ftrace, and to generally make the tracepoints useful again. It is a start and enough to begin identifying performance issues and gaps in our coverage. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-02-07 14:59:18 +00:00
Chris Wilson	d9bc7e9f32	drm/i915: Fix infinite loop regression from `21dd3734` By returning EAGAIN upon a wedged GPU before attempting to wait, we would hit an infinite loop of repeating operation without ever progressing. Instead this needs to be EIO so that userspace knows that the GPU is truly wedged and not in the process of error recovery. Similarly, we need to handle the error recovery during i915_gem_fault. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-02-07 14:33:55 +00:00
Dave Airlie	ff72145bad	drm: dumb scanout create/mmap for intel/radeon (v3) This is just an idea that might or might not be a good idea, it basically adds two ioctls to create a dumb and map a dumb buffer suitable for scanout. The handle can be passed to the KMS ioctls to create a framebuffer. It looks to me like it would be useful in the following cases: a) in development drivers - we can always provide a shadowfb fallback. b) libkms users - we can clean up libkms a lot and avoid linking to libdrm_*. c) plymouth via libkms is a lot easier. Userspace bits would be just calls + mmaps. We could probably mark these handles somehow as not being suitable for acceleartion so as top stop people who are dumber than dumb. Signed-off-by: Dave Airlie <airlied@redhat.com>	2011-02-07 12:16:14 +10:00
Chris Wilson	21dd373486	drm/i915: Defer reporting EIO until we try to use the GPU Instead of reporting EIO upfront in the entrance of an ioctl that may or may not attempt to use the GPU, defer the actual detection of an invalid ioctl to when we issue a GPU instruction. This allows us to continue to use bo in video memory (via pread/pwrite and mmap) after the GPU has hung. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-01-27 11:06:07 +00:00
Chris Wilson	e110e8d672	drm/i915: Check wedged status before throttling Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-01-27 11:05:51 +00:00
Chris Wilson	29ee399131	drm/i915: Silence a few -Wunused-but-set-variable Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-01-25 10:33:11 +00:00
Chris Wilson	bee4a186c1	drm/i915,agp/intel: Do not clear stolen entries We can only utilize the stolen portion of the GTT if we are in sole charge of the hardware. This is only true if using GEM and KMS, otherwise VESA continues to access stolen memory. Reported-by: Arnd Bergmann <arnd@arndb.de> Reported-by: Frederic Weisbecker <fweisbec@gmail.com> Tested-by: Jiri Olsa <jolsa@redhat.com> Tested-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-01-24 18:26:25 +00:00
Chris Wilson	076e2c0eb8	drm/i915: Fix use of invalid array size for ring->sync_seqno There are I915_NUM_RINGS-1 inter-ring synchronisation counters, but we were clearing I915_NUM_RINGS of them. Oops. Reported-by: Jiri Slaby <jirislaby@gmail.com> Tested-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-01-23 12:52:11 +00:00
Chris Wilson	809b63349c	drm/i915: If we hit OOM when allocating GTT pages, clear the aperture Rather than evicting an object at random, which is unlikely to alleviate the memory pressure sufficient to allow us to continue, zap the entire aperture. That should give the system long enough to recover and reap some pages from the evicted objects, forestalling the allocation error for the new object. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-01-11 22:55:48 +00:00
Chris Wilson	0a58705b2f	drm/i915: Periodically flush the active lists and requests In order to retire active buffers whilst no client is active, we need to insert our own flush requests onto the ring. This is useful for servers that queue up some rendering and then go to sleep as it allows us to the complete processing of those requests, potentially making that memory available again much earlier. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-01-11 22:15:30 +00:00
Chris Wilson	882417851a	drm/i915: Propagate error from flushing the ring ... in order to avoid a BUG() and potential unbounded waits. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-01-11 20:44:50 +00:00
Chris Wilson	b72f3acb71	drm/i915: Handle ringbuffer stalls when flushing Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-01-11 20:43:55 +00:00
Chris Wilson	63256ec534	drm/i915: Enforce write ordering through the GTT We need to ensure that writes through the GTT land before any modification to the MMIO registers and so must impose a mandatory write barrier when flushing the GTT domain. This was revealed by relaxing the write ordering by experimentally mapping the registers and the GATT as write-combining. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-01-11 20:42:53 +00:00
Chris Wilson	72bfa19c8d	drm/i915: Allow the application to choose the constant addressing mode The relative-to-general state default is useless as it means having to rewrite the streaming kernels for each batch. Relative-to-surface is more useful, as that stream usually needs to be rewritten for each batch. And absolute addressing mode, vital if you start streaming state, is also only available by adjusting the register... Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-12-20 09:41:36 +00:00
Chris Wilson	b5ba177d8d	drm/i915: Poll for seqno completion if IRQ is disabled Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32288 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-12-14 12:19:25 +00:00
Chris Wilson	b13c2b96bf	drm/i915/ringbuffer: Make IRQ refcnting atomic In order to enforce the correct memory barriers for irq get/put, we need to perform the actual counting using atomic operations. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-12-14 11:34:46 +00:00
Chris Wilson	1a1c69762a	Merge branch 'drm-intel-fixes' into drm-intel-next Conflicts: drivers/gpu/drm/i915/i915_gem.c drivers/gpu/drm/i915/intel_dp.c	2010-12-07 23:02:08 +00:00
Chris Wilson	7a1948768c	drm/i915: Emit a request to clear a flushed and idle ring for unbusy bo In order for bos to retire eventually, a request must be sent down the ring. This is expected, for example, by occlusion queries for which mesa will wait upon (whilst running glean) before issuing more batches and so the normal activity upon the ring is suspended and we need to emit a request to clear the idle ring. Reported-by: Jinjin, Wang <jinjin.wang@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=30380 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-12-07 10:59:14 +00:00
Chris Wilson	0be732841f	drm/i915: Wait for the bo if a display flip is pipelined on the other ring Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-12-06 14:37:27 +00:00
Chris Wilson	0ac74c6b33	drm/i915: Only emit a flush if there is an outstanding gpu write Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-12-06 14:36:02 +00:00
Chris Wilson	6bda10d152	drm/i915: Completely disable fence pipelining. I'm still seeing tiling corruption of PutImage and CopyArea (I think) under mutter on pnv, so obviously the pipelining logic is deeply flawed. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-12-05 23:19:37 +00:00
Chris Wilson	1ec14ad313	drm/i915: Implement GPU semaphores for inter-ring synchronisation on SNB The bulk of the change is to convert the growing list of rings into an array so that the relationship between the rings and the semaphore sync registers can be easily computed. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-12-05 00:37:38 +00:00
Chris Wilson	60de2ba51e	drm/i915: Kill the get_fence tracepoint As the tracepoint is now decoupled from when the actual register is assigned and was never complemented by detailing when the object lost its fence, it has outlived its limited usefulness. Profiling the actual stalls is a far more profitable venture anyway. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-12-02 10:20:47 +00:00
Chris Wilson	c6748e09ee	drm/i915: Remove inactive LRU tracking from set_domain_ioctl As the userspace mappings are torn down on every GPU write, we prefer to track when the buffer is activated (via a fresh i915_gem_fault). This makes the LRU conceptually simpler. With coherent mappings, the remaining use-case for set_domain_ioctl is GPU synchronisation. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-12-02 10:16:30 +00:00
Chris Wilson	d9e86c0ee6	drm/i915: Pipelined fencing [infrastructure] With this change, every batchbuffer can use all available fences (save pinned and scanout, of course) without ever stalling the gpu! In theory. Currently the actual pipelined update of the register is disabled due to some stability issues. However, just the deferred update is a significant win. Based on a series of patches by Daniel Vetter. The premise is that before every access to a buffer through the GTT we have to declare whether we need a register or not. If the access is by the GPU, a pipelined update to the register is made via the ringbuffer, and we track the last seqno of the batches that access it. If by the CPU we wait for the last GPU access and update the register (either to clear or to set it for the current buffer). One advantage of being able to pipeline changes is that we can defer the actual updating of the fence register until we first need to access the object through the GTT, i.e. we can eliminate the stall on set_tiling. This is important as the userspace bo cache does not track the tiling status of active buffers which generate frequent stalls on gen3 when enabling tiling for an already bound buffer. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2010-12-02 10:07:05 +00:00
Chris Wilson	87ca9c8a7e	drm/i915: Prevent stalling for a GTT read back from a read-only GPU target Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-12-02 10:00:15 +00:00
Chris Wilson	7d2cb39c33	drm/i915: Release fenced GTT mapping on suspend ... so that upon first use after resume we will reacquire the fence reg. Reported-by: Keith Packard <keithp@keithp.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-28 16:12:15 +00:00
Chris Wilson	3619df035e	Merge branch 'drm-intel-fixes' into drm-intel-next Conflicts: drivers/gpu/drm/i915/i915_gem.c	2010-11-28 15:37:17 +00:00
Daniel Vetter	de18a29e0f	drm/i915: fix regression due to `ba3d8d749b` We don't track gpu flush request in any special way. So even with obj->write_domain == 0, a gpu flush might be outstanding but no yet executed. Even worse, the latest request might use the object only for reading. So and unconditional call to object_wait_rendering is needed for !pipelined. Hence revert that patch fully and untangle the flushing from the synchronization again. Reported-by: Keith Packard <keithp@keithp.com> Tested-by: Keith Packard <keithp@keithp.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-28 09:05:12 +00:00
Chris Wilson	432e58edc9	drm/i915: Avoid allocation for execbuffer object list Besides the minimal improvement in reducing the execbuffer overhead, the real benefit is clarifying a few routines. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-25 21:19:26 +00:00
Chris Wilson	54cf91dc4e	drm/i915: Split i915_gem_execbuffer into its own file. A number of dragons have been seen lurking within the execbuffer code. The first step is then to isolate them from the rest and begin to scrutinise them in depth. Suggested by Daniel Vetter. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-25 21:19:25 +00:00
Chris Wilson	6299f992c0	drm/i915: Defer accounting until read from debugfs Simply remove our accounting of objects inside the aperture, keeping only track of what is in the aperture and its current usage. This removes the over-complication of BUGs that were attempting to keep the accounting correct and also removes the overhead of the accounting on the hot-paths. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-25 15:04:53 +00:00
Chris Wilson	2021746e1d	drm/i915: Mark a few functions as __must_check ... to benefit from the compiler checking that we remember to handle and propagate errors. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-25 15:04:04 +00:00
Chris Wilson	312817a39f	drm/i915: Only save and restore fences for UMS With KMS, we can simply relinquish the fence when we idle the GPU and reassign it upon first use. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-25 15:03:22 +00:00
Daniel Vetter	c6642782b9	drm/i915: Add a mechanism for pipelining fence register updates Not employed just yet... Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-25 15:01:39 +00:00
Chris Wilson	caea7476d4	drm/i915: More accurately track last fence usage by the GPU Based on a patch by Daniel Vetter. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-24 13:30:52 +00:00
Chris Wilson	a7a09aebe8	drm/i915: Rework execbuffer pinning Avoid evicting buffers that will be used later in the batch in order to make room for the initial buffers by pinning all bound buffers in a single pass before binding (and evicting for) fresh buffer. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-24 13:30:51 +00:00
Chris Wilson	919926aeb3	drm/i915: Thread the pipelining ring through the callers. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-23 20:19:16 +00:00
Chris Wilson	dddbc0e525	drm/i915: Remove a defunct BUG_ON This used to check the precondition that all fences were to be located in a mappable area, redundant now as those two parameters are combined into one. After pinning, we assert that the buffer is bound into the desired region. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-23 20:19:15 +00:00
Chris Wilson	b6913e4bdb	drm/i915: Move the implementation details of PIPE_CONTROL to the ringbuffer The pipe control object is allocated by the device for the sole use of the render ringbuffer. Move this detail from the general code to the render ring buffer initialisation. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-23 20:19:14 +00:00
Chris Wilson	92b88aeb1a	drm/i915: Not all mappable regions require GTT fence regions Combining map_and_fenceable revealed a bug in i915_gem_object_gtt_size() in that it always computed the appropriate fence size for the object regardless of tiling state which caused us to over-allocate linear buffers when binding to the GTT. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-23 20:19:13 +00:00
Chris Wilson	05394f3975	drm/i915: Use drm_i915_gem_object as the preferred type A glorified s/obj_priv/obj/ with a net reduction of over a 100 lines and many characters! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-23 20:19:10 +00:00
Daniel Vetter	7c2e6fdf45	drm/i915: move gtt handling to i915_gem_gtt.c No more drm_*_agp in i915_gem.c! Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-23 20:14:47 +00:00
Daniel Vetter	93a37f20ea	drm/i915: track objects in the gtt This is required to restore gtt mappings on resume when agp is gone. The right way to do this would be to make sturct drm_mm_node embeddable and use the allocation list maintained by the drm memory manager. But that's a bigger project. Getting rid of the per bo agp_mem will save more memory than this wastes, anyway. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-23 20:14:45 +00:00
Daniel Vetter	40ce657510	drm/i915/gtt: call chipset flush directly Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-23 20:14:44 +00:00
Daniel Vetter	23ed992a5e	drm/i915\|intel-gtt: consolidate intel-gtt.h headers ... and a few other defines. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-23 20:14:43 +00:00
Chris Wilson	e384eafc1c	Merge branch 'drm-intel-fixes' into drm-intel-next	2010-11-23 20:13:13 +00:00
Chris Wilson	bcf50e2775	drm/i915: Handle pagefaults in execbuffer user relocations Currently if we hit a pagefault when applying a user relocation for the execbuffer, we bail and return EFAULT to the application. Instead, we need to unwind, drop the dev->struct_mutex, copy all the relocation entries to a vmalloc array (to avoid any potential circular deadlocks when resolving the pagefault), retake the mutex and then apply the relocations. Afterwards, we need to again drop the lock and copy the vmalloc array back to userspace. v2: Incorporate feedback from Daniel Vetter. Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2010-11-23 20:11:43 +00:00
Chris Wilson	e624ae8e0d	Merge branch 'drm-intel-fixes' into drm-intel-next Conflicts: drivers/gpu/drm/i915/i915_gem.c	2010-11-22 08:51:36 +00:00
Chris Wilson	d1d788302e	drm/i915: Prevent integer overflow when validating the execbuffer Commit `2549d6c2` removed the vmalloc used for temporary storage of the relocation lists used during execbuffer. However, our use of vmalloc was being protected by an integer overflow check which we do want to preserve! Reported-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-21 09:30:58 +00:00
Chris Wilson	51311d0a5c	drm/i915: Do not hold mutex when faulting in user addresses Linus Torvalds found that it was rather trivial to trigger a system freeze: In fact, with lockdep, I don't even need to do the sysrq-d thing: it shows the bug as it happens. It's the X server taking the same lock recursively. Here's the problem: ============================================= [ INFO: possible recursive locking detected ] 2.6.37-rc2-00012-gbdbd01a #7 --------------------------------------------- Xorg/2816 is trying to acquire lock: (&dev->struct_mutex){+.+.+.}, at: [<ffffffff812c626c>] i915_gem_fault+0x50/0x17e but task is already holding lock: (&dev->struct_mutex){+.+.+.}, at: [<ffffffff812c403b>] i915_mutex_lock_interruptible+0x28/0x4a other info that might help us debug this: 2 locks held by Xorg/2816: #0: (&dev->struct_mutex){+.+.+.}, at: [<ffffffff812c403b>] i915_mutex_lock_interruptible+0x28/0x4a #1: (&mm->mmap_sem){++++++}, at: [<ffffffff81022d4f>] page_fault+0x156/0x37b This recursion was introduced by rearranging the locking to avoid the double locking on the fast path (4f27b5d and `fbd5a26d`) and the introduction of the prefault to encourage the fast paths (b5e4f2b). In order to undo the problem, we rearrange the code to perform the access validation upfront, attempt to prefault and then fight for control of the mutex. the best case scenario where the mutex is uncontended the prefaulting is not wasted. Reported-and-tested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-19 09:30:15 +00:00
Chris Wilson	c94f28c383	Merge branch 'drm-intel-fixes' into drm-intel-next Conflicts: drivers/gpu/drm/i915/i915_gem.c drivers/gpu/drm/i915/intel_ringbuffer.c	2010-11-15 06:49:30 +00:00
Chris Wilson	1bb95834bb	Merge remote branch 'airlied/drm-fixes' into drm-intel-fixes	2010-11-15 06:33:11 +00:00
Daniel Vetter	5e78330126	drm/i915: fix relaxed tiling for gen <= 3 && !g33 g33/pineview doesn't have any alignment constrains for unfenced tiled buffers. But older chips have. Fix this. Problem introduced in `a00b10c360`. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2010-11-15 05:22:16 +00:00
Chris Wilson	85345517fe	drm/i915: Retire any pending operations on the old scanout when switching An old and oft reported bug, is that of the GPU hanging on a MI_WAIT_FOR_EVENT following a mode switch. The cause is that the GPU is waiting on a scanline counter on an inactive pipe, and so waits for a very long time until eventually the user reboots his machine. We can prevent this either by moving the WAIT into the kernel and thereby incurring considerable cost on every swapbuffers, or by waiting for the GPU to retire the last batch that accesses the framebuffer before installing a new one. As mode switches are much rarer than swap buffers, this looks like an easy choice. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=28964 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=29252 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org	2010-11-13 09:49:11 +00:00
Chris Wilson	5d97eb69bd	drm/i915: Only add the lazy request if we end up waiting for it. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-10 20:41:16 +00:00
Joe Perches	fce7d61be0	drivers/gpu/drm: Update WARN uses Coalesce long formats. Align arguments. Add missing newlines. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2010-11-09 13:37:15 +10:00
Chris Wilson	b47b30ccda	drm/i915: Avoid might_fault during pwrite whilst holding our mutex ... and so prevent a potential circular reference: [ INFO: possible circular locking dependency detected ] 2.6.37-rc1-uwe1+ #4 ------------------------------------------------------- Xorg/1401 is trying to acquire lock: (&mm->mmap_sem){++++++}, at: [<c01e4ddb>] might_fault+0x4b/0xa0 but task is already holding lock: (&dev->struct_mutex){+.+.+.}, at: [<f869c3ac>] i915_mutex_lock_interruptible+0x3c/0x60 [i915] which lock already depends on the new lock. When the locking around the pwrite ioctl was simplified, I did not spot that the phys path never took any locks and so we introduced this potential circular reference. Reported-by: Uwe Helm <uwe.helm@googlemail.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-08 09:19:11 +00:00
Chris Wilson	045e769ab6	drm/i915: Handle GPU hangs during fault gracefully. Instead of killing the process, just return no page found and reschedule the process giving the GPU some time to (hopefully) recover. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-07 09:18:22 +00:00
Daniel Vetter	75e9e9158f	drm/i915: kill mappable/fenceable disdinction `a00b10c360` "Only enforce fence limits inside the GTT" also added a fenceable/mappable disdinction when binding/pinning buffers. This only complicates the code with no pratical gain: - In execbuffer this matters on for g33/pineview, as this is the only chip that needs fences and has an unmappable gtt area. But fences are only possible in the mappable part of the gtt, so need_fence implies need_mappable. And need_mappable is only set independantly with relocations which implies (for sane userspace) that the buffer is untiled. - The overlay code is only really used on i8xx, which doesn't have unmappable gtt. And it doesn't support tiled buffers, currently. - For all other buffers it's a bug to pass in a tiled bo. In short, this disdinction doesn't have any practical gain. I've also reverted mapping the overlay and context pages as possibly unmappable. It's not worth being overtly clever here, all the big gains from unmappable are for execbuf bos. Also add a comment for a clever optimization that confused me while reading the original patch by Chris Wilson. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-04 19:02:03 +00:00
Chris Wilson	085ce26437	drm/i915: Ensure that if we ever try to pin+fence it is mappable. When merging Daniel's full-gtt patches I had a set of tweaks which I thought I had undone. I was half right... Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=31286 Reported-by: jinjin.wang@intel.com Reported-by: Alexey Fisher <bug-track@fisher-privat.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-03 09:31:57 +00:00
Chris Wilson	f2a630bfec	Merge branch 'drm-intel-fixes' into drm-intel-next Conflicts: drivers/gpu/drm/i915/i915_gem.c drivers/gpu/drm/i915/i915_gem_evict.c	2010-11-01 13:44:41 +00:00
Chris Wilson	c6afd65807	drm/i915: Apply big hammer to serialise buffer access between rings Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org	2010-11-01 13:39:24 +00:00
Chris Wilson	0f8c6d7ca9	drm/i915: Move the invalidate\|flush information out of the device struct ... and into a local structure scoped for the single function in which it is used. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-01 12:38:44 +00:00
Chris Wilson	13b2928933	drm/i915: Apply big hammer to serialise buffer access between rings Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-01 12:31:19 +00:00
Chris Wilson	5eac3ab459	drm/i915: Evict just the purgeable GTT entries on the first pass Take two passes to evict everything whilst searching for sufficient free space to bind the batchbuffer. After searching for sufficient free space using LRU eviction, evict everything that is purgeable and try again. Only then if there is insufficient free space (or the GTT is too badly fragmented) evict everything from the aperture and try one last time. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-31 12:31:30 +00:00
Chris Wilson	ff75b9bc48	drm/i915: Fix typo from `e5281ccd` in i915_gem_attach_phys_object() Accessing the uninitialised obj->pages instead of the local page lead to an OOPs. Reported-by: Xavier Chantry <chantry.xavier@gmail.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-30 22:52:31 +01:00
Chris Wilson	872d860c85	drm/i915: Remove the duplicate domain-change tracepoint for GPU flush Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-29 11:15:54 +01:00
Chris Wilson	a00b10c360	drm/i915: Only enforce fence limits inside the GTT. So long as we adhere to the fence registers rules for alignment and no overlaps (including with unfenced accesses to linear memory) and account for the tiled access in our size allocation, we do not have to allocate the full fenced region for the object. This allows us to fight the bloat tiling imposed on pre-i965 chipsets and frees up RAM for real use. [Inside the GTT we still suffer the additional alignment constraints, so it doesn't magic allow us to render larger scenes without stalls -- we need the expanded GTT and fence pipelining to overcome those...] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-29 11:15:07 +01:00
Chris Wilson	7465378fd7	drm/i915: Convert BUG_ON(pin_count) from an impossible condition Also spotted by Dan Carpenter. obj->pin_count is unsigned so the BUG_ON(obj->pin_count<0) will never trigger. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-29 10:54:29 +01:00
Chris Wilson	bbe2e11a4b	drm/i915: Do not return -1 from shrinker when nr_to_scan == 0 The error code is only expected during the actual pruning and not during the first measurement (nr_to_scan == 0) pass. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-28 22:35:07 +01:00
Chris Wilson	395b70be54	drm/i915: Flush read-only buffers from the active list upon idle as well It is possible for the active list to only contain a read-only buffer so that the ring->gpu_write_list remains entry. This leads to an inconsistency between i915_gpu_is_active() and i915_gpu_idle() causing an infinite spin during the shrinker and an assertion failure that i915_gpu_idle() does indeed flush all buffers from the active lists. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-28 21:31:19 +01:00
Chris Wilson	4a684a4117	drm/i915: Kill GTT mappings when moving from GTT domain In order to force a page-fault on a GTT mapping after we start using it from the GPU and so enforce correct CPU/GPU synchronisation, we need to invalidate the mapping. Pointed out by Owain G. Ainsworth. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-28 20:55:03 +01:00
Chris Wilson	e5281ccd2e	drm/i915: Eliminate nested get/put pages By using read_cache_page() for individual pages during pwrite/pread we can eliminate an unnecessary large allocation (and immediate free) of obj->pages. Also this eliminates any potential nesting of get/put pages, simplifying the code and preparing the path for greater things. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-28 20:55:02 +01:00
Chris Wilson	39a01d1fb6	drm/i915: Remove mmap_offset Since we rarely use the mmap_offset and it is easily computable from the obj->map_list.hash, remove it. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-28 20:55:02 +01:00
Chris Wilson	17250b7155	drm/i915: Make the inactive object shrinker per-device Eliminate the racy device unload by embedding a shrinker into each device. Smaller, simpler code. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-28 20:55:01 +01:00
Chris Wilson	da761a6edf	drm/i915: Bail early if we try to mmap an object too large to be mapped. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-27 23:31:08 +01:00
Daniel Vetter	fb7d516af1	drm/i915: add accounting for mappable objects in gtt v2 More precisely: For those that _need_ to be mappable. Also add two BUG_ONs in fault and pin to check the consistency of the mappable flag. Changes in v2: - Add tracking of gtt mappable space (to notice mappable/unmappable balancing issues). - Improve the mappable working set tracking by tracking fault and pin separately. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-27 23:31:08 +01:00
Daniel Vetter	ec57d2602a	drm/i915: add mappable to gem_object_bind tracepoint This way we can make some more educated guesses as to why exactly we can't use 2G apertures to their full potential ;) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-27 23:31:07 +01:00
Daniel Vetter	53984635a6	drm/i915: use the complete gtt At least the part that's currently enabled by the BIOS. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-27 23:31:06 +01:00
Daniel Vetter	16e809acc1	drm/i915: unbind unmappable objects on fault/pin In i915_gem_object_pin obviously unbind only if mappable is true. This is the last part to enable gtt_mappable_end != gtt_size, which the next patch will do. v2: Fences on g33/pineview only work in the mappable part of the gtt. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-27 23:31:05 +01:00
Daniel Vetter	920afa77ce	drm/i915: range-restricted bind_to_gtt Like before add a parameter mappable (also to gem_object_pin) and set it depending upon the context. Only bos that are brought into the gtt due to an execbuffer call can be put into the unmappable part of the gtt, everything else (especially pinned objects) need to be put into the mappable part of the gtt. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-27 23:31:05 +01:00
Daniel Vetter	a6e0aa4214	drm/i915: range-restricted eviction support Add a mappable parameter to i915_gem_evict_something to distinguish the two cases (non-restricted vs. mappable gtt allocations). No functional changes because the mappable limit is set to the end of the gtt currently. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-27 23:31:04 +01:00
Chris Wilson	3cce469cab	drm/i915: Propagate error from failing to queue a request Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-27 23:31:03 +01:00
Chris Wilson	b2223497b4	drm/i915: Remove the confusing global waiting/irq seqno Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-27 23:30:59 +01:00
Chris Wilson	7e318e18f2	drm/i915: Move object to GPU domains after dispatching execbuffer In the event that we fail to dispatch the execbuffer, for example if there is insufficient space on the ring, we were leaving the objects in an inconsistent state. Notably they were marked as being in the GPU write domain, but were not added to the ring or any list. This would lead to inevitable oops: [ 1010.522940] [drm:i915_gem_do_execbuffer] ERROR dispatch failed -16 [ 1010.523055] BUG: unable to handle kernel NULL pointer dereference at 0000000000000088 [ 1010.523097] IP: [<ffffffff8122d006>] i915_gem_flush_ring+0x26/0x140 [ 1010.523120] PGD 14cf2f067 PUD 14ce04067 PMD 0 [ 1010.523140] Oops: 0000 [#1] SMP [ 1010.523154] last sysfs file: /sys/devices/virtual/vc/vcsa2/uevent [ 1010.523173] CPU 0 [ 1010.523183] Pid: 716, comm: X Not tainted 2.6.36+ #34 LosLunas CRB/SandyBridge Platform [ 1010.523206] RIP: 0010:[<ffffffff8122d006>] [<ffffffff8122d006>] i915_gem_flush_ring+0x26/0x140 [ 1010.523233] RSP: 0018:ffff88014bf97cd8 EFLAGS: 00010296 [ 1010.523249] RAX: ffff88014e2d1808 RBX: 0000000000000000 RCX: 0000000000000000 [ 1010.523270] RDX: 0000000000000002 RSI: 0000000000000000 RDI: 0000000000000000 [ 1010.523290] RBP: ffff88014e2d1000 R08: 0000000000000002 R09: 00000000400c645f [ 1010.523311] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000002 [ 1010.523331] R13: ffff88014e29a000 R14: 00000000000000c8 R15: ffffffff8162eb28 [ 1010.523352] FS: 00007fc62379d700(0000) GS:ffff88001fc00000(0000) knlGS:0000000000000000 [ 1010.523375] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1010.523392] CR2: 0000000000000088 CR3: 000000014bf87000 CR4: 00000000000406f0 [ 1010.523412] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1010.523433] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 1010.523454] Process X (pid: 716, threadinfo ffff88014bf96000, task ffff88014cc1ee40) [ 1010.523475] Stack: [ 1010.523483] ffff88014d5199c0 0000000000000200 0000000000000000 ffff88014bcc6400 [ 1010.523509] <0> 0000000000000000 0000000000000001 ffff88014e29a000 ffff88014bcc6400 [ 1010.523537] <0> ffffffff8162eb28 ffffffff8122faa8 ffff88014e29a000 ffff88014bcc6400 [ 1010.523568] Call Trace: [ 1010.523578] [<ffffffff8122faa8>] ? i915_gem_object_flush_gpu_write_domain+0x48/0x80 [ 1010.523601] [<ffffffff8122fb8e>] ? i915_gem_object_set_to_gtt_domain+0x2e/0xb0 [ 1010.523623] [<ffffffff8123113b>] ? i915_gem_set_domain_ioctl+0xdb/0x1f0 [ 1010.523644] [<ffffffff8120a3f1>] ? drm_ioctl+0x3d1/0x460 [ 1010.523660] [<ffffffff81231060>] ? i915_gem_set_domain_ioctl+0x0/0x1f0 [ 1010.523682] [<ffffffff81092618>] ? vma_prio_tree_insert+0x28/0x120 [ 1010.523701] [<ffffffff8109f379>] ? vma_link+0x99/0xf0 [ 1010.523717] [<ffffffff810a111d>] ? mmap_region+0x1ed/0x4f0 [ 1010.523734] [<ffffffff810c306f>] ? do_vfs_ioctl+0x9f/0x580 [ 1010.523750] [<ffffffff810c3599>] ? sys_ioctl+0x49/0x80 [ 1010.523767] [<ffffffff810022eb>] ? system_call_fastpath+0x16/0x1b [ 1010.523785] Code: 00 00 00 00 00 41 57 89 ce 41 56 41 55 41 54 45 89 c4 55 48 89 fd 53 48 89 d3 44 89 c2 48 89 df 4c 8d b3 c8 00 00 00 48 83 ec 18 <ff> 93 88 00 00 00 48 8b 83 c8 00 00 00 4c 8b bd 30 03 00 00 48 [ 1010.523946] RIP [<ffffffff8122d006>] i915_gem_flush_ring+0x26/0x140 [ 1010.523966] RSP <ffff88014bf97cd8> [ 1010.523977] CR2: 0000000000000088 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-27 23:26:34 +01:00
Chris Wilson	e1f99ce6ca	drm/i915: Propagate errors from writing to ringbuffer Preparing the ringbuffer for adding new commands can fail (a timeout whilst waiting for the GPU to catch up and free some space). So check for any potential error before overwriting HEAD with new commands, and propagate that error back to the user where possible. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-27 23:26:34 +01:00
Chris Wilson	78501eac34	drm/i915/ringbuffer: Drop the redundant dev from the vfunc interface The ringbuffer keeps a pointer to the parent device, so we can use that instead of passing around the pointer on the stack. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-27 12:18:21 +01:00
Linus Torvalds	c48c43e422	Merge branch 'drm-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (476 commits) vmwgfx: Implement a proper GMR eviction mechanism drm/radeon/kms: fix r6xx/7xx 1D tiling CS checker v2 drm/radeon/kms: properly compute group_size on 6xx/7xx drm/radeon/kms: fix 2D tile height alignment in the r600 CS checker drm/radeon/kms/evergreen: set the clear state to the blit state drm/radeon/kms: don't poll dac load detect. gpu: Add Intel GMA500(Poulsbo) Stub Driver drm/radeon/kms: MC vram map needs to be >= pci aperture size drm/radeon/kms: implement display watermark support for evergreen drm/radeon/kms/evergreen: add some additional safe regs v2 drm/radeon/r600: fix tiling issues in CS checker. drm/i915: Move gpu_write_list to per-ring drm/i915: Invalidate the to-ring, flush the old-ring when updating domains drm/i915/ringbuffer: Write the value passed in to the tail register agp/intel: Restore valid PTE bit for Sandybridge after `bdd3072` drm/i915: Fix flushing regression from `9af90d19f` drm/i915/sdvo: Remove unused encoding member i915: enable AVI infoframe for intel_hdmi.c [v4] drm/i915: Fix current fb blocking for page flip drm/i915: IS_IRONLAKE is synonymous with gen == 5 ... Fix up conflicts in - drivers/gpu/drm/i915/{i915_gem.c, i915/intel_overlay.c}: due to the new simplified stack-based kmap_atomic() interface - drivers/gpu/drm/vmwgfx/vmwgfx_drv.c: added .llseek entry due to BKL removal cleanups.	2010-10-26 18:57:59 -07:00
Peter Zijlstra	3e4d3af501	mm: stack based kmap_atomic() Keep the current interface but ignore the KM_type and use a stack based approach. The advantage is that we get rid of crappy code like: #define __KM_PTE \ (in_nmi() ? KM_NMI_PTE : \ in_irq() ? KM_IRQ_PTE : \ KM_PTE0) and in general can stop worrying about what context we're in and what kmap slots might be appropriate for that. The downside is that FRV kmap_atomic() gets more expensive. For now we use a CPP trick suggested by Andrew: #define kmap_atomic(page, args...) __kmap_atomic(page) to avoid having to touch all kmap_atomic() users in a single patch. [ not compiled on: - mn10300: the arch doesn't actually build with highmem to begin with ] [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: fix up drivers/gpu/drm/i915/intel_overlay.c] Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Chris Metcalf <cmetcalf@tilera.com> Cc: David Howells <dhowells@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Miller <davem@davemloft.net> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Dave Airlie <airlied@linux.ie> Cc: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-10-26 16:52:08 -07:00
Chris Wilson	641934069d	drm/i915: Move gpu_write_list to per-ring ... to prevent flush processing of an idle (or even absent) ring. This fixes a regression during suspend from `87acb0a5`. Reported-and-tested-by: Alexey Fisher <bug-track@fisher-privat.net> Tested-by: Peter Clifton <pcjc2@cam.ac.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-24 20:22:51 +01:00
Chris Wilson	b6651458d3	drm/i915: Invalidate the to-ring, flush the old-ring when updating domains When the object has been written to by the gpu it remains on the ring until its flush has been retired. However, when the object is moving to the ring and the associated cache needs to be invalidated, we need to perform the flush on the target ring, not the one it came from (which is NULL in the reported case and so the flush was entirely absent). Reported-by: Peter Clifton <pcjc2@cam.ac.uk> Reported-and-tested-by: Alexey Fisher <bug-track@fisher-privat.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-23 11:07:21 +01:00
Chris Wilson	878a3c37d3	drm/i915: Fix flushing regression from `9af90d19f` Whilst moving the code around in `9af90d19f`, I dropped the or'ing in of new write domains which would zero out the write domain for a render target if later reused as a source later in the batch. This meant that we might drop a required flush before reading from the render target. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=31043 Reported-by: xunx.fang@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-22 10:48:12 +01:00
Chris Wilson	549f736582	drm/i915: Enable SandyBridge blitter ring Based on an original patch by Zhenyu Wang, this initializes the BLT ring for SandyBridge and enables support for user execbuffers. Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-21 19:08:39 +01:00
Chris Wilson	b5dc608c98	drm/i915: Copy the updated reloc->presumed_offset back to the user If the userspace driver is using a constant relocation array with a static buffer, they will pass the same relocation array back to the kernel. So we do need to update the presumed offset value in those relocations to reflect the current object so that they remain correct with future batchbuffers and we avoid the necessity of having to suspend execution and perform redundant relocations. Fixes the regression introduced by `12f889c` for applications using absolute addressing on trees of buffer (i.e. the current consumers of libdrm_intel.so). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=30996 Reported-by: Wang, Jinjin <jinjin.wang@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-20 21:06:34 +01:00
Chris Wilson	69dc4987cb	drm/i915: Track objects in global active list (as well as per-ring) To handle retirements, we need per-ring tracking of active objects. To handle evictions, we need global tracking of active objects. As we enable more rings, rebuilding the global list from the individual per-ring lists quickly grows tiresome and overly complicated. Tracking the active objects in two lists is the lesser of two evils. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-20 10:51:51 +01:00
Chris Wilson	87acb0a550	drm/i915: Simplify most HAS_BSD() checks ... by always initialising the empty ringbuffer it is always then safe to check whether it is active. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-20 10:51:51 +01:00
Chris Wilson	9af90d19f8	drm/i915: cache the last object lookup during pin_and_relocate() The most frequent relocation within a batchbuffer is a contiguous sequence of vertex buffer relocations, for which we can virtually eliminate the drm_gem_object_lookup() overhead by caching the last handle to object translation. In doing so we refactor the pin and relocate retry loop out of do_execbuffer into its own helper function and so improve the error paths. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-20 10:51:50 +01:00
Chris Wilson	1d7cfea152	drm/i915: Do interrupible mutex lock first to avoid locking for unreference One of the primarily consumers of the i915 driver is X, a large signal driven application. Frequently when writing into the buffers, there is a pending signal which causes us not to take the interruptible lock but then we need to take that same lock around the object unreference. By rearranging the code to do the interruptible lock as the first check, we can avoid the frequent additional locking around the unreference. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-19 09:20:23 +01:00
Chris Wilson	4f27b75d56	drm/i915: rearrange mutex acquisition for pread ... to avoid the double acquisition along fast[er] paths. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-19 09:19:55 +01:00
Chris Wilson	fbd5a26d50	drm/i915: Rearrange acquisition of mutex during pwrite ... to avoid reacquiring it to drop the object reference count on exit. Note we have to make sure we now drop (and reacquire) the lock around acquiring the mm semaphore on the slow paths. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-19 09:19:47 +01:00
Chris Wilson	b5e4feb661	drm/i915: Attempt to prefault user pages for pread/pwrite ... in the hope that it makes the atomic fast paths more likely. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-19 09:19:37 +01:00
Chris Wilson	202f2fef7a	drm/i915: Avoid taking the mutex for dropping the refcnt upon creation After allocation a handle for the fresh object, we know that we can safely drop the refcnt without triggering a free so we do not need the mutex. Strangely, this mutex acquisition is the one that appears on driver profiles. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-19 09:19:28 +01:00
Chris Wilson	f0c43d9b7e	drm/i915: Perform relocations in CPU domain [if in CPU domain] Avoid an early eviction of the batch buffer into the uncached GTT domain, and so do the relocation fixup in cacheable memory. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-19 09:19:18 +01:00
Chris Wilson	2549d6c26c	drm/i915: Avoid vmallocing a buffer for the relocations ... perform an access validation check up front instead and copy them in on-demand, during i915_gem_object_pin_and_relocate(). As around 20% of the CPU overhead may be spent inside vmalloc for the relocation entries when submitting an execbuffer [for x11perf -aa10text], the savings are considerable and result in around a 10% throughput increase [for glyphs]. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-19 09:18:36 +01:00
Chris Wilson	e59f2bac15	drm/i915: Wait for pending flips on the GPU Currently, if a batch buffer refers to an object with a pending flip, then we sleep until that pending flip is completed (unpinned and signalled). This is so that a flip can be queued and the user can continue rendering to the backbuffer oblivious to whether the buffer is still pinned as the scan out. (The kernel arbitrating at the last moment to stall the batch and wait until the buffer is unpinned and replaced as the front buffer.) As we only have a queue depth of 1, we can simply wait for the current pending flip to complete and continue rendering. We can achieve this with a single WAIT_FOR_EVENT command inserted into the ring buffer prior to executing the batch, without stalling the client. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-07 19:10:09 +01:00
Dave Airlie	fb7ba2114b	Merge remote branch 'korg/drm-fixes' into drm-vmware-next necessary for some of the vmware fixes to be pushed in. Conflicts: drivers/gpu/drm/drm_gem.c drivers/gpu/drm/i915/intel_fb.c include/drm/drmP.h	2010-10-06 11:10:48 +10:00
Linus Torvalds	c470af0a27	Merge branch 'drm-intel-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ickle/drm-intel * 'drm-intel-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ickle/drm-intel: drm/i915: Rephrase pwrite bounds checking to avoid any potential overflow drm/i915: Sanity check pread/pwrite drm/i915: Use pipe state to tell when pipe is off drm/i915: vblank status not valid while training display port drivers/gpu/drm/i915/i915_gem.c: Add missing error handling code drm/i915: Fix refleak during eviction. drm/i915: fix GMCH power reporting	2010-10-04 11:10:26 -07:00
Chris Wilson	35b62a89b0	drm/i915: Skip pread/pwrite if size to copy is 0. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-04 10:07:46 +01:00
Chris Wilson	df6d075a4d	Merge branch 'drm-intel-fixes' into drm-intel-next	2010-10-04 10:07:38 +01:00
Chris Wilson	7dcd2499de	drm/i915: Rephrase pwrite bounds checking to avoid any potential overflow ... and do the same for pread. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org	2010-10-03 14:16:18 +01:00
Chris Wilson	ce9d419dbe	drm/i915: Sanity check pread/pwrite Move the access control up from the fast paths, which are no longer universally taken first, up into the caller. This then duplicates some sanity checking along the slow paths, but is much simpler. Tracked as CVE-2010-2962. Reported-by: Kees Cook <kees@ubuntu.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org	2010-10-03 14:16:17 +01:00
Chris Wilson	58e10eb92d	Merge branch 'drm-intel-fixes' into drm-intel-next Conflicts: drivers/gpu/drm/i915/i915_gem_evict.c drivers/gpu/drm/i915/intel_display.c drivers/gpu/drm/i915/intel_dp.c	2010-10-03 10:56:11 +01:00
Julia Lawall	929f49bf22	drivers/gpu/drm/i915/i915_gem.c: Add missing error handling code Extend the error handling code with operations found in other nearby error handling code A simplified version of the sematic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @r exists@ @r@ statement S1,S2,S3; constant C1,C2,C3; @@ if (...) {... S1 return -C1;} ... if (...) {... when != S1 return -C2;} ... *if (...) {... S1 return -C3;} // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org	2010-10-02 15:21:26 +01:00
Chris Wilson	1cdf7fef79	drm/i915: Don't mask the return code whilst relocating. The return from move_to_gtt_domain() may indicate a pending signal which needs to handled as opposed to an actual error, for instance, so report the original return value rather than forcing an EINVAL. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-02 15:12:41 +01:00
Linus Torvalds	18ffe4b18c	Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: vmwgfx: Fix fb VRAM pinning failure due to fragmentation vmwgfx: Remove initialisation of dev::devname vmwgfx: Enable use of the vblank system vmwgfx: vt-switch (master drop) fixes drm/vmwgfx: Fix breakage introduced by commit "drm: block userspace under allocating buffer and having drivers overwrite it (v2)" drm: Hold the mutex when dropping the last GEM reference (v2) drm/gem: handlecount isn't really a kref so don't make it one. drm: i810/i830: fix locked ioctl variant drm/radeon/kms: add quirk for MSI K9A2GM motherboard drm/radeon/kms: fix potential segfault in r600_ioctl_wait_idle drm: Prune GEM vma entries drm/radeon/kms: fix up encoder info messages for DFP6 drm/radeon: fix PCI ID 5657 to be an RV410	2010-10-01 10:58:31 -07:00
Chris Wilson	069efc1dac	drm/i915: Clear fence registers on GPU reset When the GPU is reset, the fence registers are invalidated, so release the objects and clear them out. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-01 14:45:22 +01:00
Chris Wilson	812ed49243	drm/i915: Force the domain to CPU on unbinding whilst wedged. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=30083 Reported-by: Sitsofe Wheeler <sitsofe@yahoo.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-01 14:45:21 +01:00
Chris Wilson	73aa808f10	drm: Move the GTT accounting to i915 Only drm/i915 does the bookkeeping that makes the information useful, and the information maintained is driver specific, so move it out of the core and into its single user. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Dave Airlie <airlied@redhat.com>	2010-10-01 14:45:20 +01:00
Dave Airlie	29d08b3efd	drm/gem: handlecount isn't really a kref so don't make it one. There were lots of places being inconsistent since handle count looked like a kref but it really wasn't. Fix this my just making handle count an atomic on the object, and have it increase the normal object kref. Now i915/radeon/nouveau drivers can drop the normal reference on userspace object creation, and have the handle hold it. This patch fixes a memory leak or corruption on unload, because the driver had no way of knowing if a handle had been actually added for this object, and the fbcon object needed to know this to clean itself up properly. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Dave Airlie <airlied@redhat.com>	2010-10-01 09:17:44 +10:00
Chris Wilson	f394940b8d	drm/i915: Remove redundant deletion of obj->gpu_write_list At that point as the object is no longer in any GPU write domain it must not be on the list, so the list_del() is redundant. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-30 09:30:51 +01:00
Chris Wilson	5cdf588174	drm/i915: Make get/put pages static Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-30 09:30:13 +01:00
Chris Wilson	23bc598253	drm/i915/debug: Convert i915_verify_active() to scan all lists ... and check more regularly. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-30 09:30:11 +01:00
Chris Wilson	891b48cfc8	drm/i915: Avoid blocking the kworker thread on a stuck mutex Just reschedule the retire requests again if the device is currently busy. The request list will be pruned along other paths so will never grow unbounded and so we can afford to miss the occasional pruning. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-29 12:26:37 +01:00
Chris Wilson	3d2a812ae4	drm/i915/debug: Remove default WATCH_BUF Replaced by tracepoints. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-29 11:41:19 +01:00
Chris Wilson	97d1ebaf81	drm/i915/debug: Remove defunct WATCH_LRU This has bitrotted through inuse and superseded by tracing and debugfs. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-29 11:41:18 +01:00
Chris Wilson	e0e41598b4	Merge branch 'drm-intel-fixes' into drm-intel-next	2010-09-28 15:48:38 +01:00
Chris Wilson	a56ba56c27	Revert "drm/i915: Drop ring->lazy_request" With multiple rings generating requests independently, the outstanding requests must also be track independently. Reported-by: Wang Jinjin <jinjin.wang@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=30380 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-28 11:30:52 +01:00
Chris Wilson	ced270fa89	drm/i915: Ensure that the mode change flushing is currently uninterruptible Introduced by `48b956c5`, I had thought I had already fixed this. Oh well. Reported-by: Sitsofe Wheeler <sitsofe@yahoo.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-26 22:50:36 +01:00
Chris Wilson	1c25595f8d	drm/i915: Convert the file mutex into a spinlock Daniel Vetter pointed out that in this case is would be clearer and cleaner to use a spinlock instead of a mutex to protect the per-file request list manipulation. Make it so. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-26 11:03:27 +01:00
Chris Wilson	76c1dec197	drm/i915: Make the mutex_lock interruptible on ioctl paths ... and combine it with the wedged completion handler. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-25 12:23:12 +01:00
Chris Wilson	30dbf0c07f	drm/i915: Adjust hangcheck EIO semantics Owain Ainsworth reported an issue between the interaction of the hangcheck and userspace immediately (and permanently) falling back to s/w rasterisation. In order to break the mutex and begin resetting the GPU, we must abort the current operation (usually within the wait) and climb sufficiently far back up the call chain to drop the mutex. In his implementation, Owain has a loop within the ioctl handler to detect the hang and then sleep until the error handler has run. I've chosen to return to userspace and report an EAGAIN which should trigger the userspace ioctl handler to repeat the call (simply because it felt less invasive...). Before hitting a wedged GPU, we then wait upon completion of the error handler. Reported-by: Owain G. Ainsworth <zerooa@googlemail.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-25 12:23:12 +01:00
Chris Wilson	f787a5f59e	drm/i915: Only hold a process-local lock whilst throttling. Avoid cause latencies in other clients by not taking the global struct mutex and moving the per-client request manipulation a local per-client mutex. For example, this allows a compositor to schedule a page-flip (through X) whilst an OpenGL application is monopolising the GPU. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-24 21:03:00 +01:00
Chris Wilson	e6c3a2a6d3	drm/i915: Use an uninterruptible wait for page-flips during modeset We need to drain the pending flips prior to disabling the pipe during modeset, and these need to be done in an uninterruptible fashion. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-24 14:19:57 +01:00
Chris Wilson	20f0cd55f6	drm/i915: Remove the broken flush_ring from page-flip This is already performed with the pipelined flush, so by the time we schedule the flush in the page-flip, the ring is NULL and we OOPs instead. Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-23 11:02:55 +01:00
Chris Wilson	9b74f7348f	drm/i915: Fix 945GM regression in `e259befd` A minor typo caused a single fence register to be incorrectly programmed, resulting in occassional tiling corruption. Reported-and-tested-by: Hans de Bruin <bruinjm@xs4all.nl> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=18962 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org	2010-09-23 10:30:57 +01:00
Chris Wilson	5c12a07e80	drm/i915: Drop ring->lazy_request We are not currently using it as intended, so remove the complication. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-22 11:58:55 +01:00
Chris Wilson	dfaae392f4	drm/i915: Clear the gpu_write_list on resetting write_domain upon hang Otherwise we will hit a list handling assertion when moving the object to the inactive list. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-22 10:31:52 +01:00
Chris Wilson	9e0ae53404	drm/i915: Don't overwrite the returned error-code During i915_gem_create_mmap_offset() if the subsystem reports an error code, use it. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-21 15:05:24 +01:00
Chris Wilson	f13d3f7311	drm/i915: Track pinned objects Keep a list of pinned objects and display it via debugfs. Now all objects that exist in the GTT are always tracked on one of the active, flushing, inactive or pinned lists. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-21 11:24:17 +01:00
Chris Wilson	265db9585e	drm/i915: Drain any pending flips on the fb prior to unpinning If we have queued a page flip on the current fb and then request a mode change, wait until the page flip completes before performing the new request. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-21 11:24:17 +01:00
Chris Wilson	c78ec30bba	drm/i915: Merge ring flushing and lazy requests Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-21 11:24:16 +01:00
Chris Wilson	53640e1d07	drm/i915: Track gpu fence usage Track if the gpu requires the fence for the execution of a batch buffer and so only wait upon the retirement of the object's last rendering seqno if the fence is in use by the GPU. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-21 11:20:54 +01:00
Chris Wilson	c7f9f9a8b8	drm/i915: Use ring->flush() instead of MI_FLUSH Use the ring abstraction to hide the details of having choose the appropriate flushing method. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-21 11:19:59 +01:00
Xiang, Haihao	5c1143bbec	drm/i915: do not export the instances of struct intel_ring_buffer Introduce intel_init_render_ring_buffer(), intel_init_bsd_ring_buffer for ring initialization. Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-21 11:19:55 +01:00
Chris Wilson	77f0123022	drm/i915: Clear GPU read domains on reset Clear the GPU read domain for the inactive objects on a reset so that they are correctly invalidated on reuse. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-21 11:19:53 +01:00
Chris Wilson	9375e446e7	drm/i915: Clear flushing lists on GPU reset Owain Ainsworth noticed that the reset code failed to clear the flushing list leaving the driver in an inconsistent state following a hung GPU. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-21 11:19:52 +01:00
Chris Wilson	9220434a87	drm/i915: Only emit a flush request on the active ring. When flushing the GPU domains,we emit a flush on both rings, even though they share a unified cache. Only emit the flush on the currently active ring. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-21 11:19:51 +01:00
Chris Wilson	b84d5f0c22	drm/i915: Inline i915_gem_ring_retire_request() Change the semantics to retire any buffer older than the current seqno rather than repeatedly calling calling the function to retire the buffer at the head of the list matching the request seqno. Whilst this should have no semantic impact on the implementation, Daniel was wondering if there was a bug where we might miss a retirement and so end up with a continually growing active list. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-21 11:19:50 +01:00
Chris Wilson	a6c45cf013	drm/i915: INTEL_INFO->gen supercedes i8xx, i9xx, i965g Avoid confusion between i965g meaning broadwater and the gen4+ chipset families. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-21 11:19:45 +01:00
Chris Wilson	e9e5f8e8d3	Merge branch 'drm-intel-fixes' into HEAD Conflicts: drivers/char/agp/intel-agp.c drivers/gpu/drm/i915/intel_crt.c	2010-09-21 11:19:32 +01:00
Chris Wilson	e259befd90	drm/i915: Fix Sandybridge fence registers With 5 places to update when adding handling for fence registers, it is easy to overlook one or two. Correct that oversight, but fence management should be improved before a new set of registers is added. Bugzilla: https://bugs.freedesktop.org/show_bug?id=30199 Original patch by: Yuanhan Liu <yuanhan.liu@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org	2010-09-17 08:18:30 +01:00
Chris Wilson	2b6efaa476	drm/i915: Remove unused intel_ringbuffer->ring_flag This can always be re-added should somebody find a use... Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-14 21:13:00 +01:00
Chris Wilson	2cf34d7b7e	drm/i915: Allow get_fence_reg() to be uninterruptible As we currently may need to acquire a fence register during a modeset, we need to be able to do so in an uninterruptible manner. So expose that parameter to the callers of the fence management code. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-14 21:08:36 +01:00
Chris Wilson	48b956c5a8	drm/i915: Push pipelining of display plane flushes to the caller This ensures that we do wait upon the flushes to complete if necessary and avoid the visual tears, whilst enabling pipelined page-flips. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-14 21:08:35 +01:00
Chris Wilson	0bc23aad3b	drm/i915: Fix regression in `ba3d8d749b` I pulled the wrong version of the patch from Daniel Vetter which was missing the read barriers -- and the one that was causing all the trouble was from i915_gem_object_put_fence_reg(), leading to GPU hangs on gen3. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-14 10:35:43 +01:00
Chris Wilson	7213342db5	drm/i915: Consolidate flushing the display plane Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-14 10:34:27 +01:00
Chris Wilson	b3b079dbef	drm/i915: Reduce hangcheck frequency By reducing the hangcheck frequency we check less often, conserving resources, and still detect a lock up quickly. On a fast machine with a slow GPU (like a Core2 paired with a 945G) it is easy for the hangcheck to misfire as we check too fast. Also once hung and if we fail to completely reset the chip, we have a nasty habit of proclaming a hang many times a second and generating a strobe-like display. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-14 10:30:10 +01:00
Chris Wilson	995b6762f0	drm/i915: Quieten sparse warnings for missing prototypes. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-08 10:23:57 +01:00
Chris Wilson	de227ef090	drm/i915: Kill the active list spinlock This spinlock only served debugging purposes in a time when we could not be sure of the mutex ever being released upon a GPU hang. As we now should be able rely on hangcheck to do the job for us (and that error reporting should not itself require the struct mutex) we can kill the incomplete attempt at protection. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-08 10:23:56 +01:00
Chris Wilson	8dc5d14741	drm/i915: Preallocate requests By allocating the request prior to writing to the ringbuffer, we can abort the operation without leaving the GPU in an inconsistent state. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2010-09-08 10:23:50 +01:00
Daniel Vetter	4fc6ee7646	drm/i915: drop i915_add_request right in front of i915_wait_request ... take advantage of the new implicit request issuing of i915_wait_request. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-08 10:23:39 +01:00
Daniel Vetter	ba3d8d749b	drm/i915: move the wait_rendering call into flush_gpu_write_domain One caller (for the pageflip support) wants a purely pipelined flush. Distinguish this case by a new parameter. This will also be useful later on for pipelined fencing. v2: Simplify the code by depending upon the implicit request emitting of i915_wait_request. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> [ickle: And drop the non-interruptible support in the process.] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-08 10:23:38 +01:00
Daniel Vetter	617dbe2787	drm/i915: drop seqno argument from i915_gem_object_move_to_active By moving one i915_add_request we can solely depend on the new auto-seqno-numbering behaviour. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-08 10:23:37 +01:00
Daniel Vetter	86394c669a	drm/i915: kill a no longer necessary BUG_ON i915_gem_object_move_to_active can handle zero seqno for us now. And not emitting a request is not fatal here - we'll try to emit a new one if we have to wait for some rendering to complete. In case this assumption ever gets accidentally broken, there's already a BUG_ON to catch it in i915_do_wait_request. So just silently ignore ENOMEM here instead of screwing up the whole drm. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-08 10:23:37 +01:00
Daniel Vetter	8a1a49f954	drm/i915: move flushing list processing to i915_retire_commands ... instead of threading flush_domains through the execbuf code to i915_add_request. With this change 2 small cleanups are possible (likewise the majority of the patch): - The flush_domains parameter of i915_add_request is always 0. Drop it and the corresponding logic. - Ditto for the seqno param of i915_gem_process_flushing_list. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-08 10:23:36 +01:00
Daniel Vetter	a6910434e1	drm/i915: only one interrupt per batchbuffer is not enough! Previously I thought that one interrupt per batchbuffer should be enough. Now tedious benchmarking showed this to be wrong. Therefore track whether any commands have been isssued with a future seqno (like pipelined fencing changes or flushes). If this is the case emit a request before issueing the batchbuffer. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-08 10:23:35 +01:00
Daniel Vetter	8bff917c93	drm/i915: move flushing list processing to i915_gem_flush Now that we can move objects to the active list without already having emitted a request, move the flushing list handling into i915_gem_flush. This makes more sense and allows to drop a few i915_add_request calls that are not strictly necessary. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-08 10:23:35 +01:00
Daniel Vetter	e35a41de39	drm/i915: allow lazy emitting of requests Sometimes (like when flushing in preparation of batchbuffer execution) we know that we'll emit a request but haven't yet done so. Allow this case by simply taking the next seqno by default. Ensure that a request is eventually emitted before waiting for an request by issuing it in i915_wait_request iff this is not yet done. Also replace one open-coded version of i915_gem_object_wait_rendering, to prevent future code-diversion. Chris Wilson asked me to explain and clarify what this patch does and why. Here it goes: Old way of moving objects onto the active list and associating them with a reques: 1. i915_add_request + store the returned seqno somewhere 2. i915_gem_object_move_to_active (with the stored seqno as parameter) For the current users, this is all fine. But I'd like to associate objects (and fence regs) with the batchbuffer request deep down in the execbuf call-chain. I thought about three ways of implementing this. a) Don't care, just emit request when we need a new seqno. When heavily pipelining fence reg changes, this would have caused tons of superflous request (and corresponding irqs). b) Thread all changed fences, objects, whatever through the execbuf-maze, so that when we emit a request, we can store the new seqno at all the right places. c) Kill that seqno-threading-around business by simply storing the next seqno, i.e. allow 2. to be done before 1. in the above sequence. I've decided to implement c) (in this patch). The following patches are just fall-out that resulted from this small conceptual change. * We can handle the flushing list processing where we actually emit a flush (i915_gem_flush and i915_retire_commands) instead of in i915_add_request. The code makes IMHO more sense this way (and i915_add_request looses the flush_domains parameter, obviously). * We can avoid emitting unnecessary requests. IMHO there's no point in emitting more than one request per batchbuffer (with or without an corresponding irq). * By enforcing 2. before 1. ordering in the above sequence the seqno argument of i915_gem_object_move_to_active is redundant and can be dropped. v2: Now i915_wait_request issues request if it is not yet emitted. Also introduce i915_gem_next_request_seqno(dev) just in case we ever need to do some prep work before using a new seqno. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> [ickle: Keep i915_gem_object_set_to_display_plane() uninterruptible.] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-08 10:23:34 +01:00
Daniel Vetter	75ef9da2cd	drm/i915: unload: fix retire_work races ums-gem code correctly cancels the retire work (at lastclose time), kms does not do so. Fix this by canceling the work right after ideling the gpu. While staring at the code I noticed that the work function is not static. Fix this, too. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-08 10:13:28 +01:00
Daniel Vetter	bc0c7f1443	drm/i915: unload: fix error_work races This is the first patch to clean up module unload races due to outstanding timers/work. Preparatory step: Thou shalt not destroy the workqueue when new work might still get enqued. Now error_work gets queued by the hangcheck timer and only (atomically) reads the chip wedged status. So cancel it right after the hangcheck timer is killed. But the hangcheck is armed by interrupts, so move everything after irqs are disabled. Also change a del_timer to a del_timer_sync in the ums gem code, the hangcheck timer is self-rearming. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-08 10:13:24 +01:00
Zhenyu Wang	f8f235e5bb	agp/intel: Fix cache control for Sandybridge Sandybridge GTT has new cache control bits in PTE, which controls graphics page cache in LLC or LLC/MLC, so we need to extend the mask function to respect the new bits. And set cache control to always LLC only by default on Gen6. Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: stable@kernel.org Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-07 11:16:43 +01:00
Dan Carpenter	c877cdce93	i915: return -EFAULT if copy_to_user fails copy_to_user() returns the number of bytes remaining to be copied and I'm pretty sure we want to return a negative error code here. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org	2010-09-06 23:09:54 +01:00
Chris Wilson	1dfd9754cd	Revert "drm/i915: Unreference object not handle on creation" This reverts commit `86f100b136`. The kref API requires the handlecount to be initialised to one on object creation (so that kref_get() doesn't complain upon first use) so the dalliance in the drivers is required in order to sink the initial floating reference. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org	2010-09-06 23:09:49 +01:00
Linus Torvalds	4238a417a9	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel: (58 commits) drm/i915,intel_agp: Add support for Sandybridge D0 drm/i915: fix render pipe control notify on sandybridge agp/intel: set 40-bit dma mask on Sandybridge drm/i915: Remove the conflicting BUG_ON() drm/i915/suspend: s/IS_IRONLAKE/HAS_PCH_SPLIT/ drm/i915/suspend: Flush register writes before busy-waiting. i915: disable DAC on Ironlake also when doing CRT load detection. drm/i915: wait for actual vblank, not just 20ms drm/i915: make sure eDP PLL is enabled at the right time drm/i915: fix VGA plane disable for Ironlake+ drm/i915: eDP mode set sequence corrections drm/i915: add panel reset workaround drm/i915: Enable RC6 on Ironlake. drm/i915/sdvo: Only set is_lvds if we have a valid fixed mode. drm/i915: Set up a render context on Ironlake drm/i915 invalidate indirect state pointers at end of ring exec drm/i915: Wake-up wait_request() from elapsed hang-check (v2) drm/i915: Apply i830 errata for cursor alignment drm/i915: Only update i845/i865 CURBASE when disabled (v2) drm/i915: FBC is updated within set_base() so remove second call in mode_set() ...	2010-08-22 11:03:27 -07:00
Chris Wilson	156dadc180	drm/i915: Remove the conflicting BUG_ON() We now attempt to free "active" objects following a GPU hang as either the GPU will be reset or the hang is permenant. In either case, the GPU writes will not be flushed to main memory and it should be safe to return that memory back to the system. The BUG_ON(active) is thus overkill and can erroneously fire after a EIO. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-08-21 23:21:13 -07:00
Chris Wilson	bf79cb914d	drm: Use ENOENT consistently for the error return for an unmatched handle. This is consistent with trying to access a filename that not exist within a directory which is a good analogy here. The main reason for the change is that it is easy to confuse the error code of EBADF as an performing an ioctl on an invalid file descriptor (rather than an unknown object). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Dave Airlie <airlied@redhat.com>	2010-08-10 10:46:55 +10:00
Chris Wilson	6eeefaf3c8	drm/i915: Apply i830 errata for cursor alignment i830 requires 32bpp cursors to be aligned to 16KB, so we have to expose the alignment parameter to i915_gem_attach_phys_object(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-08-09 11:24:36 -07:00
Chris Wilson	ae9fed6b60	drm/i915: Truncate the shmem backing pages on purge shmfs doesn't actually implement i_ops->truncate() so we were not immedatiately releasing the backing pages when shrinking the gfx cache under OOM. Instead use a combination of truncate_inode_pages() and i_ops->truncate_range() as is used by shmem_delete_inode(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-08-09 11:24:34 -07:00
Chris Wilson	7d1c4804ae	drm/i915: Maintain LRU order of inactive objects upon access by CPU (v2) In order to reduce the penalty of fallbacks under memory pressure and to avoid a potential immediate ping-pong of evicting a mmaped buffer, we move the object to the tail of the inactive list when a page is freshly faulted or the object is moved into the CPU domain. We choose not to protect the CPU objects from casual eviction, preferring to keep the GPU active for as long as possible. v2: Daniel Vetter found a bug where I forgot that pinned objects are kept off the inactive list. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-08-09 11:24:33 -07:00
Chris Wilson	b47eb4a2b3	drm/i915: Move the eviction logic to its own file. The eviction code is the gnarly underbelly of memory management, and is clearer if kept separated from the normal domain management in GEM. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-08-09 11:24:32 -07:00
Chris Wilson	6f392d5486	drm/i915: Use a common seqno for all rings. This will be used by the eviction logic to maintain fairness between the rings. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-08-09 11:24:32 -07:00
Daniel Vetter	0108a3edd5	drm/i915: prepare for fair lru eviction This does two little changes: - Add an alignment parameter for evict_something. It's not really great to whack a carefully sized hole into the gtt with the wrong alignment. Especially since the fallback path is a full evict. - With the inactive scan stuff we need to evict more that one object, so move the unbind call into the helper function that scans for the object to be evicted, too. And adjust its name. No functional changes in this patch, just preparation. Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-08-09 11:24:32 -07:00
Chris Wilson	bf1a109239	drm/i915: Append the object onto the inactive list on binding. In order to properly track bound objects, they need to exist on one of the inactive/active lists or be pinned. As this is a requirement, do the work inside i915_gem_bind_to_gtt() rather than dotted around the callsites. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-08-09 11:24:32 -07:00
Chris Wilson	ae7d49d879	drm/i915: Emit a backtrace if we attempt to rebind a pinned buffer This debugging trace was useful for finding the fbcon regression on i965, and it may prove useful again in future. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-08-09 11:24:30 -07:00
Chris Wilson	0be555b66a	drm/i915: report all active objects as busy Incorporates a similar patch by Daniel Vetter, the alteration being to report the current busy state after retiring. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-08-09 11:24:30 -07:00
Chris Wilson	88f356b725	drm/i915: Only emit flushes on active rings. This avoids the excess flush and requests on idle rings (and spamming the debug log ;-) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-08-09 11:24:29 -07:00
Chris Wilson	fca3ec01e0	drm,io-mapping: Specify slot to use for atomic mappings This is required should we ever attempt to use an io-mapping where KM_USER0 is verboten, such as inside an IRQ context. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2010-08-05 08:48:53 +10:00
Chris Wilson	86f100b136	drm/i915: Unreference object not handle on creation When creating an object, we create the handle by which it is known to the process and which own the reference to the object. That reference to the new handle is what we want to transfer to the process, not the lost reference to the object; so free the local object reference not the process's handle reference. This brings i915_gem_object_create_ioctl() into line with drm_gem_open_ioctl() Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-08-01 19:58:06 -07:00
Chris Wilson	8dc1775dce	drm/i915: Attempt to uncouple object after catastrophic failure in unbind If we fail to flush outstanding GPU writes but return the memory to the system, we risk corrupting memory should the GPU recovery and complete those writes. On the other hand, if we bail early and free the object then we have a definite use-after-free and real memory corruption. Choose the lesser of two evils, since in order to recover from the hung GPU we need to completely reset it, those pending writes should never happen. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-08-01 19:56:29 -07:00
Chris Wilson	be72615bcf	drm/i915: Repeat unbinding during free if interrupted (v6) If during the freeing of an object the unbind is interrupted by a system call, which is quite possible if we have outstanding GPU writes that must be flushed, the unbind is silently aborted. This still leaves the AGP region and backing pages allocated, and perhaps more importantly, the object remains upon the various lists exposing us to memory corruption. I think this is the cause behind the use-after-free, such as Bug 15664 - Graphics hang and kernel backtrace when starting Azureus with Compiz enabled https://bugzilla.kernel.org/show_bug.cgi?id=15664 v2: Daniel Vetter reminded me that kernel space programming is never easy. We cannot simply spin to clear the pending signal and so must deferred the freeing of the object until later. v3: Run from the top level retire requests. v4: Tested with P(return -ERESTARTSYS)=.5 from i915_gem_do_wait_request() v5: Rebase against Eric's for-linus tree. v6: Refactor, split and add a comment about avoiding unbounded recursion. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel@ffwll.ch> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-08-01 19:53:24 -07:00
Chris Wilson	b09a1feca6	drm/i915: Refactor i915_gem_retire_requests() Combine the iteration over active render rings into a common function. This is in preparation for reusing the idle function to also retire deferred free requests. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-08-01 19:52:57 -07:00
Chris Wilson	2dafb1e082	drm/i915: Propagate error from i915_gem_object_flush_gpu_write_domain() Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-08-01 19:03:44 -07:00
Chris Wilson	5f35308bab	drm/i915: Propagate error from drm_install_irq() during EnterVT Simple fix for error propagation along the old UMS path. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-08-01 19:03:44 -07:00
Chris Wilson	43b27f40eb	drm/i915: Explosion following OOM in do_execbuffer. Oops, when merging the extra details following an OOM, I missed that driver_private is now NULL and the correct way to convert from the drm_gem_object into the drm_i915_gem_object is to use to_intel_bo(). BUG: unable to handle kernel NULL pointer dereference at 00000069 IP: [<c11a4a02>] i915_gem_do_execbuffer+0x71f/0xbb6 *pde = 00000000 Oops: 0000 [#1] SMP last sysfs file: /sys/devices/virtual/vc/vcsa3/uevent Pid: 10993, comm: X Not tainted 2.6.35-rc2+ #67 / EIP: 0060:[<c11a4a02>] EFLAGS: 00213202 CPU: 0 EIP is at i915_gem_do_execbuffer+0x71f/0xbb6 EAX: f647e8a8 EBX: 00000000 ECX: 00000003 EDX: 00000000 ESI: 00424000 EDI: 00000000 EBP: f6508e48 ESP: f6508dd4 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 Process X (pid: 10993, ti=f6508000 task=f6432880 task.ti=f6508000) Stack: f6508de0 f7130000 00000001 00000000 00000000 f647e8a8 00000000 f64f8480 <0> f7974414 00000000 00000006 00000000 00000000 f6578000 00000008 00000006 <0> f6797880 00400000 00000000 ffffffe4 f7974400 000000d0 000000d0 000001c0 Call Trace: [<c11a4f3a>] ? i915_gem_execbuffer2+0xa1/0xe7 [<c118ab96>] ? drm_ioctl+0x22c/0x2fa [<c11a4e99>] ? i915_gem_execbuffer2+0x0/0xe7 [<c107e88c>] ? do_sync_read+0x8f/0xca [<c1088cbd>] ? vfs_ioctl+0x2c/0x96 [<c118a96a>] ? drm_ioctl+0x0/0x2fa [<c10891f4>] ? do_vfs_ioctl+0x429/0x45a [<c107e5c9>] ? fsnotify_access+0x54/0x5f [<c107ee1c>] ? vfs_read+0x9a/0xae [<c1089258>] ? sys_ioctl+0x33/0x4d [<c1002610>] ? sysenter_do_call+0x12/0x26 Code: d0 89 4d c4 31 c9 89 45 d8 eb 44 8b 45 cc 8b 14 88 8b 42 50 89 45 bc 8b 45 a0 8b 52 38 89 55 d0 31 d2 f6 40 20 01 74 0d 8b 55 bc <f6> 42 69 30 0f 95 c2 0f b6 d2 8b 45 d0 c7 45 d4 00 00 00 00 89 EIP: [<c11a4a02>] i915_gem_do_execbuffer+0x71f/0xbb6 SS:ESP 0068:f6508dd4 CR2: 0000000000000069 ---[ end trace 3f1d514b34d39381 ]--- Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-08-01 19:03:42 -07:00
Dave Airlie	d656ae53f6	Merge tag 'v2.6.35-rc6' into drm-radeon-next Need this to avoid conflicts with future radeon fixes	2010-08-02 10:05:24 +10:00
Linus Torvalds	f4b23cc2d5	Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: drm/r600: fix possible NULL pointer derefernce drm/radeon/kms: add quirk for ASUS HD 3600 board include/linux/vgaarb.h: add missing part of include guard drm/nouveau: Fix crashes during fbcon init on single head cards. drm/nouveau: fix pcirom vbios shadow breakage from acpi rom patch drm/radeon/kms: fix shared ddc harder drm/i915: enable low power render writes on GEN3 hardware. drm/i915: Define MI_ARB_STATE bits vmwgfx: return -EFAULT if copy_to_user fails fb: handle allocation failure in alloc_apertures() drm: radeon: check kzalloc() result drm/ttm: Fix build on architectures without AGP drm/radeon/kms: fix gtt MC base alignment on rs4xx/rs690/rs740 asics drm/radeon/kms: fix possible mis-detection of sideport on rs690/rs740 drm/radeon/kms: fix legacy tv-out pal mode	2010-07-20 18:29:25 -07:00
Dave Airlie	944001201c	drm/i915: enable low power render writes on GEN3 hardware. A lot of 945GMs have had stability issues for a long time, this manifested as X hangs, blitter engine hangs, and lots of crashes. one such report is at: https://bugs.freedesktop.org/show_bug.cgi?id=20560 along with numerous distro bugzillas. This only took a week of digging and hair ripping to figure out. Tracked down and tested on a 945GM Lenovo T60, previously running x11perf -copypixwin500 or x11perf -copywinpix500 repeatedly would cause the GPU to wedge within 4 or 5 tries, with random busy bits set. After this patch no hangs were observed. cc: stable@kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com>	2010-07-20 15:24:18 +10:00
Dave Chinner	7f8275d0d6	mm: add context argument to shrinker callback The current shrinker implementation requires the registered callback to have global state to work from. This makes it difficult to shrink caches that are not global (e.g. per-filesystem caches). Pass the shrinker structure to the callback so that users can embed the shrinker structure in the context the shrinker needs to operate on and get back to it in the callback via container_of(). Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2010-07-19 14:56:17 +10:00
Linus Torvalds	cd9f040df6	drm/i915: add 'reclaimable' to i915 self-reclaimable page allocations The hibernate issues that got fixed in commit `985b823b91` ("drm/i915: fix hibernation since i915 self-reclaim fixes") turn out to have been incomplete. Vefa Bicakci tested lots of hibernate cycles, and without the __GFP_RECLAIMABLE flag the system eventually fails to resume. With the flag added, Vefa can apparently hibernate forever (or until he gets bored running his automated scripts, whichever comes first). The reclaimable flag was there originally, and was one of the flags that were dropped (unintentionally) by commit `4bdadb9785` ("drm/i915: Selectively enable self-reclaim") that introduced all these problems, but I didn't want to just blindly add back all the flags in commit `985b823b91`, and it looked like __GFP_RECLAIM wasn't necessary. It clearly was. I still suspect that there is some subtle reason we're missing that causes the problems, but __GFP_RECLAIMABLE is certainly not wrong to use in this context, and is what the code historically used. And we have no idea what the causes the corruption without it. Reported-and-tested-by: M. Vefa Bicakci <bicave@superonline.com> Cc: Dave Airlie <airlied@gmail.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-07-18 09:44:37 -07:00
Daniel Vetter	db3307a9f7	drm: kill drm_mm_node->private Only ever assigned, never used. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> [glisse: I will re-add if needed for range-restricted allocations] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Dave Airlie <airlied@redhat.com>	2010-07-07 12:26:44 +10:00
Linus Torvalds	985b823b91	drm/i915: fix hibernation since i915 self-reclaim fixes Since commit `4bdadb9785` ("drm/i915: Selectively enable self-reclaim"), we've been passing GFP_MOVABLE to the i915 page allocator where we weren't before due to some over-eager removal of the page mapping gfp_flags games the code used to play. This caused hibernate on Intel hardware to result in a lot of memory corruptions on resume. See for example http://bugzilla.kernel.org/show_bug.cgi?id=13811 Reported-by: Evengi Golov (in bugzilla) Signed-off-by: Dave Airlie <airlied@redhat.com> Tested-by: M. Vefa Bicakci <bicave@superonline.com> Cc: stable@kernel.org Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-07-01 18:37:01 -07:00
Chris Wilson	ab34c22681	drm/i915: Fix up address spaces in slow_kernel_write() Since we now get_user_pages() outside of the mutex prior to performing the copy, we kmap() the page inside the copy routine and so need to perform an ordinary memcpy() and not copy_from_user(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-05-28 11:03:29 -07:00
Chris Wilson	99a03df57c	drm/i915: Use non-atomic kmap for slow copy paths As we do not have a requirement to be atomic and avoid sleeping whilst performing the slow copy for shmem based pread and pwrite, we can use kmap instead, thus simplifying the code. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-05-28 11:02:36 -07:00
Chris Wilson	9b8c4a0b21	drm/i915: Avoid moving from CPU domain during pwrite We can avoid an early clflush when pwriting if we use the current CPU write domain rather than moving the object to the GTT domain for the purposes of the pwrite. This has the advantage of not flushing the presumably hot data that we want to upload into the bo, and of ascribing the clflush to the execution when profiling. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-05-28 11:02:00 -07:00
Chris Wilson	68f95ba9e2	drm/i915: Cleanup after failed initialization of ringbuffers The callers expect us to cleanup any partially initialised structures before reporting the error. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-05-28 11:01:02 -07:00
Chris Wilson	654fc6073f	drm/i915: Reject bind_to_gtt() early if object > aperture If the object is bigger than the entire aperture, reject it early before evicting everything in a vain attempt to find space. v2: Use E2BIG as suggested by Owain G. Ainsworth. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org Signed-off-by: Eric Anholt <eric@anholt.net>	2010-05-28 10:52:15 -07:00
Chris Wilson	3d1cc47037	drm/i915: Remove spurious warning "Failure to install fence" This particular warning is harmless as we emit during the normal pinning process where the batch buffer requires more fences than is available without eviction. Only if we fail to evict enough fences does this become a problem, so include the requested number of fences in the ultimate error message. v2: Remember to compile test even trial patches to remove warnings. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-05-28 10:49:38 -07:00
Chris Wilson	ac0c6b5ad3	drm/i915: Rebind bo if currently bound with incorrect alignment. Whilst pinning the buffer, check that that its current alignment matches the requested alignment. If it does not, rebind. This should clear up any final render errors whilst resuming, for reference: Bug 27070 - [i915] Page table errors with empty ringbuffer https://bugs.freedesktop.org/show_bug.cgi?id=27070 Bug 15502 - render error detected, EIR: 0x00000010 https://bugzilla.kernel.org/show_bug.cgi?id=15502 Bug 13844 - i915 error: "render error detected" https://bugzilla.kernel.org/show_bug.cgi?id=13844 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org Signed-off-by: Eric Anholt <eric@anholt.net>	2010-05-28 10:43:38 -07:00
Chris Wilson	808b24d6ed	drm/i915: Propagate error from unbinding an unfenceable object. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-05-28 10:42:52 -07:00
Chris Wilson	b118c1e363	drm/i915: Avoid nesting of domain changes when setting display plane Nesting domain changes will cause confusion when trying to interpret the tracepoints describing the sequence of changes for the object, as well as obscuring the order of operations for the reader of the code. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-05-28 10:42:04 -07:00
Daniel Vetter	778c35444f	drm/i915: combine all small integers into one single bitfield This saves a whooping 7 dwords. Zero functional changes. Because some of the refcounts are rather tightly calculated, I've put BUG_ONs in the code to check for overflows. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-05-26 14:13:36 -07:00
Zou Nan hai	d1b851fc0d	drm/i915: implement BSD ring buffer V2 The BSD (bit stream decoder) ring is used for accessing the BSD engine which decodes video bitstream for H.264 and VC1 on G45+. It is asynchronous with the render ring and has access to separate parts of the GPU from it, though the render cache is coherent between the two. Signed-off-by: Zou Nan hai <nanhai.zou@intel.com> Signed-off-by: Xiang Hai hao <haihao.xiang@intel.com> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-05-26 13:46:58 -07:00
Zou Nan hai	852835f343	drm/i915: convert some gem structures to per-ring V2 The active list and request list move into the ringbuffer structure, so each can track its active objects in the order they are in that ring. The flushing list does not, as it doesn't matter which ring caused data to end up in the render cache. Objects gain a pointer to the ring they are active on (if any). Signed-off-by: Zou Nan hai <nanhai.zou@intel.com> Signed-off-by: Xiang Hai hao <haihao.xiang@intel.com> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-05-26 13:42:11 -07:00
Zou Nan hai	8187a2b70e	drm/i915: introduce intel_ring_buffer structure (V2) Introduces a more complete intel_ring_buffer structure with callbacks for setup and management of a particular ringbuffer, and converts the render ring buffer consumers to use it. Signed-off-by: Zou Nan hai <nanhai.zou@intel.com> Signed-off-by: Xiang Hai hao <haihao.xiang@intel.com> [anholt: Fixed up whitespace fail and rebased against prep patches] Signed-off-by: Eric Anholt <eric@anholt.net>	2010-05-26 13:24:49 -07:00
Eric Anholt	d3301d86b4	drm/i915: Rename dev_priv->ring to dev_priv->render_ring. With the advent of the BSD ring, be clear about which ring this is. The docs are pretty consistent with calling this the Render engine at this point.	2010-05-26 12:36:00 -07:00
Eric Anholt	62fdfeaf8b	drm/i915: Move ringbuffer-related code to intel_ringbuffer.c. This is preparation for supporting multiple ringbuffers on Ironlake. The non-copy-and-paste changes are: - de-staticing functions - I915_GEM_GPU_DOMAINS moving to i915_drv.h to be used by both files. - i915_gem_add_request had only half its implementation copy-and-pasted out of the middle of it.	2010-05-26 12:36:00 -07:00
Daniel Vetter	007cc8ac4e	drm/i915: move fence lru to struct drm_i915_fence_reg This lru tracks fences, not objects, so move it to where it belongs. As a side effect, this nicely shrinks drm_i915_gem_object by two pointers. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-05-10 13:38:31 -07:00
Eric Anholt	34dc4d4423	Merge remote branch 'origin/master' into drm-intel-next Conflicts: drivers/gpu/drm/i915/i915_dma.c drivers/gpu/drm/i915/i915_drv.h drivers/gpu/drm/radeon/r300.c The BSD ringbuffer support that is landing in this branch significantly conflicts with the Ironlake PIPE_CONTROL fix on master, and requires it to be tested successfully anyway.	2010-05-10 13:36:52 -07:00
Chris Wilson	1637ef413b	drm/i915: Wait for the GPU whilst shrinking, if truly desperate. By idling the GPU and discarding everything we can when under extreme memory pressure, the number of OOM-killer events is dramatically reduced. For instance, this makes it possible to run firefox-planet-gnome.trace again on my swapless 512MiB i915. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-05-07 13:59:28 -07:00
Jesse Barnes	1918ad77f7	drm/i915: fix non-Ironlake 965 class crashes My PIPE_CONTROL fix (just sent via Eric's tree) was buggy; I was testing a whole set of patches together and missed a conversion to the new HAS_PIPE_CONTROL macro, which will cause breakage on non-Ironlake 965 class chips. Fortunately, the fix is trivial and has been tested. Be sure to use the HAS_PIPE_CONTROL macro in i915_get_gem_seqno, or we'll end up reading the wrong graphics memory, likely causing hangs, crashes, or worse. Reported-by: Zdenek Kabelac <zdenek.kabelac@gmail.com> Reported-by: Toralf Förster <toralf.foerster@gmx.de> Tested-by: Toralf Förster <toralf.foerster@gmx.de> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-04-23 10:39:20 -07:00
Jesse Barnes	e552eb7038	drm/i915: use PIPE_CONTROL instruction on Ironlake and Sandy Bridge Since 965, the hardware has supported the PIPE_CONTROL command, which provides fine grained GPU cache flushing control. On recent chipsets, this instruction is required for reliable interrupt and sequence number reporting in the driver. So add support for this instruction, including workarounds, on Ironlake and Sandy Bridge hardware. https://bugs.freedesktop.org/show_bug.cgi?id=27108 Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-04-22 14:48:55 -07:00
Daniel Vetter	a8089e849a	drm/i915: drop pointer to drm_gem_object Luckily the change is quite a little bit less invasive than I've feared. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2010-04-20 13:23:14 +10:00
Daniel Vetter	62b8b21515	drm/i915: don't use ->driver_private anymore Thanks to the to_intel_bo helper, this change is rather trivial. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2010-04-20 13:22:56 +10:00
Daniel Vetter	c397b9084c	drm/i915: embed the gem object into drm_i915_gem_object Just embed it and adjust the pointers, No other changes (that's for later patches). Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2010-04-20 13:22:45 +10:00
Daniel Vetter	ac52bc56de	drm/i915: introduce i915_gem_alloc_object Just preparation, no functional change. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2010-04-20 13:22:26 +10:00
Daniel Vetter	fd632aa34c	drm: free core gem object from driver callbacks When drivers embed the core gem object into their own structures, they'll have to do this. Temporarily this results in an ugly kfree(gem_obj); in every gem driver. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2010-04-20 13:19:33 +10:00
Daniel Vetter	c36a2a6de5	drm/i915: fix tiling limits for i915 class hw v2 Current code is definitely crap: Largest pitch allowed spills into the TILING_Y bit of the fence registers ... :( I've rewritten the limits check under the assumption that 3rd gen hw has a 3d pitch limit of 8kb (like 2nd gen). This is supported by an otherwise totally misleading XXX comment. This bug mostly resulted in tiling-corrupted pixmaps because the kernel allowed too wide buffers to be tiled. Bug brought to the light by the xf86-video-intel 2.11 release because that unconditionally enabled tiling for pixmaps, relying on the kernel to check things. Tiling for the framebuffer was not affected because the ddx does some additional checks there ensure the buffer is within hw-limits. v2: Instead of computing the value that would be written into the hw fence registers and then checking the limits simply check whether the stride is above the 8kb limit. To better document the hw, add some WARN_ONs in i915_write_fence_reg like I've done for the i830 case (using the right limits). Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=27449 Tested-by: Alexander Lam <lambchop468@gmail.com> Cc: stable@kernel.org Signed-off-by: Eric Anholt <eric@anholt.net>	2010-04-18 17:58:24 -07:00
Linus Torvalds	13bd8e4673	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel: drm/i915: Ignore LVDS EDID when it is unavailabe or invalid drm/i915: Add no_lvds entry for the Clientron U800 drm/i915: Rename many remaining uses of "output" to encoder or connector. drm/i915: Rename intel_output to intel_encoder. agp/intel: intel_845_driver is an agp driver! drm/i915: introduce to_intel_bo helper drm/i915: Disable FBC on 915GM and 945GM.	2010-04-17 14:28:50 -07:00
Tejun Heo	5a0e3ad6af	include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>	2010-03-30 22:02:32 +09:00
Daniel Vetter	23010e43b3	drm/i915: introduce to_intel_bo helper This is a purely cosmetic change to make changes in this area easier. And hey, it's not only clearer and typechecked, but actually shorter, too! [anholt: To clarify, this is a change to let us later make drm_i915_gem_object subclass drm_gem_object, instead of having drm_gem_object have a pointer to i915's private data] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Dave Airlie <airlied@gmail.com> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-03-25 11:06:17 -07:00
Chris Wilson	1f2b10131f	drm/i915: Avoid NULL deref in get_pages() unwind after error. Fixes: http://bugzilla.kernel.org/show_bug.cgi?id=15527 NULL pointer dereference in i915_gem_object_save_bit_17_swizzle BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<f82b5d2b>] i915_gem_object_save_bit_17_swizzle+0x5b/0xc0 [i915] Call Trace: [<f82aea55>] ? i915_gem_object_put_pages+0x125/0x150 [i915] [<f82aeb71>] ? i915_gem_object_get_pages+0xf1/0x110 [i915] [<f82b0de8>] ? i915_gem_object_bind_to_gtt+0xb8/0x2a0 [i915] [<c02db74d>] ? drm_mm_get_block_generic+0x4d/0x180 [<f82b11cd>] ? i915_gem_mmap_gtt_ioctl+0x16d/0x240 [i915] [<f82ae786>] ? i915_gem_madvise_ioctl+0x86/0x120 [i915] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reported-by: maciej.rutecki@gmail.com Cc: stable@kernel.org Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-03-17 13:17:24 -07:00
Eric Anholt	71cf39b117	drm/i915: Enable VS timer dispatch. This could resolve HW deadlocks where a unit downstream of the VS is waiting for more input, the VS has one vertex queued up but not dispatched because it hopes to get one more vertex for 2x4 dispatch, and software isn't handing more vertices down because it's waiting for rendering to complete. The B-Spec says you should always have this bit set. Signed-off-by: Eric Anholt <eric@anholt.net>	2010-03-17 12:59:32 -07:00
Owain G. Ainsworth	5d9391628e	drm/i915: remove an unnecessary wait_request() The continue just after this call with loop around and wait for the request just added just fine. This leads to slightly more compact code. Signed-Off-by: Owain G. Ainsworth <oga@openbsd.org> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-03-17 12:59:30 -07:00
Daniel Vetter	16edd55029	drm/i915: check for multiple write domains in pin_and_relocate The assumption that an object has only ever one write domain is deeply threaded into gem (it's even encoded the the singular of the variable name). Don't let userspace screw us over. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-02-26 13:23:22 -08:00
Daniel Vetter	922a2efc1b	drm/i915: clean-up i915_gem_flush_gpu_write_domain Now that we have an exact gpu write domain tracking, we don't need to move objects to the active list ourself. i915_add_request will take care of that under all circumstances. Idea stolen from a patch by Chris Wilson <chris@chris-wilson.co.uk>. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-02-26 13:23:22 -08:00
Daniel Vetter	4df2faf451	drm/i915: reuse i915_gpu_idle helper We have it, so use it. This required moving the function to avoid a forward declaration. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-02-26 13:23:21 -08:00
Daniel Vetter	6356039653	drm/i915: ensure lru ordering of fence_list The fence_list should be lru ordered for otherwise we might try to steal a fence reg from an active object even though there are fences from inactive objects available. lru ordering was obeyed for gpu access everywhere save when moving dirty objects from flushing_list to active_list. Fixing this cause the code to indent way to much, so I've extracted the flushing_list processing logic into its on function. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-02-26 13:23:21 -08:00
Daniel Vetter	ae3db24aab	drm/i915: extract fence stealing code The spaghetti logic in there tripped up my brain's code parser for a few secs. Prevent this from happening again by extracting the fence stealing code into a seperate functions. IMHO this slightly clears up the code flow. v2: Beautified according to ickle's comments. v3: ickle forgot to flush his comment queue ... Now there's also a we-are-paranoid BUG_ON in there. v4: I've forgotten to switch on my brain when doing v3. Now the BUG_ON actually checks something useful. v5: Clean up a stale comment as noted by Eric Anholt. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-02-26 13:23:21 -08:00
Daniel Vetter	4a87b8ca21	drm/i915: fixup active list locking in object_unbind All other accesses take this spinlock, so do this here, too. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-02-26 13:23:21 -08:00
Daniel Vetter	798750e30d	drm/i915: reuse i915_gem_object_put_fence_reg for fence stealing code This has a few functional changes against the old code: * a few more unnecessary loads and stores to the drm_i915_fence_reg objects. Also an unnecessary store to the hw fence register. * zaps any userspace mappings before doing other flushes. Only changes anything when userspace does racy stuff against itself. * also flush GTT domain. This is a noop, but still try to keep the bookkeeping correct. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-02-26 13:23:21 -08:00
Eric Anholt	f6e450a641	drm/i915: Fix sandybridge status page setup. The register's moved to the same location as the one for the BCS, it seems. Signed-off-by: Eric Anholt <eric@anholt.net>	2010-02-26 13:23:18 -08:00
Eric Anholt	4e901fdc26	drm/i915: Set up fence registers on sandybridge. Signed-off-by: Eric Anholt <eric@anholt.net>	2010-02-26 13:23:18 -08:00
Eric Anholt	bad720ff3e	drm/i915: Add initial bits for VGA modesetting bringup on Sandybridge. Signed-off-by: Eric Anholt <eric@anholt.net>	2010-02-26 13:23:17 -08:00
Dave Airlie	30d6c72c4a	Merge remote branch 'anholt/drm-intel-next' into drm-next-stage * anholt/drm-intel-next: drm/i915: Record batch buffer following GPU error drm/i915: give up on 8xx lid status drm/i915: reduce some of the duplication of tiling checking drm/i915: blow away userspace mappings before fence change drm/i915: move a gtt flush to the correct place agp/intel: official names for Pineview and Ironlake drm/i915: overlay: drop superflous gpu flushes drm/i915: overlay: nuke readback to flush wc caches drm/i915: provide self-refresh status in debugfs drm/i915: provide FBC status in debugfs drm/i915: fix drps disable so unload & re-load works drm/i915: Fix OGLC performance regression on 945 drm/i915: Deobfuscate the render p-state obfuscation drm/i915: add dynamic performance control support for Ironlake drm/i915: enable memory self refresh on 9xx drm/i915: Don't reserve compatibility fence regs in KMS mode. drm/i915: Keep MCHBAR always enabled drm/i915: Replace open-coded eviction in i915_gem_idle()	2010-02-25 13:39:36 +10:00
Dave Airlie	de19322d55	Merge remote branch 'korg/drm-core-next' into drm-next-stage * korg/drm-core-next: drm/ttm: handle OOM in ttm_tt_swapout drm/radeon/kms/atom: fix shr/shl ops drm/kms: fix spelling of "CLOCK" drm/kms: fix fb_changed = true else statement drivers/gpu/drm/drm_fb_helper.c: don't use private implementation of atoi() drm: switch all GEM/KMS ioctls to unlocked ioctl status. Use drm_gem_object_[handle_]unreference_unlocked where possible drm: introduce drm_gem_object_[handle_]unreference_unlocked	2010-02-25 13:39:29 +10:00
Owain Ainsworth	f590d279eb	drm/i915: reduce some of the duplication of tiling checking i915_gem_object_fenceable was mostly just a repeat of the i915_gem_object_fence_offset_ok, but also checking the size (which was checkecd when we allowed that BO to be tiled in the first place). So instead, export the latter function and use it in place. Signed-Off-By: Owain G. Ainsworth <oga@openbsd.org> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-02-22 11:54:42 -05:00
Daniel Vetter	10ae9bd25a	drm/i915: blow away userspace mappings before fence change This aligns it with the other user of i915_gem_clear_fence_reg, which blows away the mapping before changing the fence reg. Only affects userspace if it races against itself when changing tiling parameters, i.e. behaviour is undefined, anyway. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-02-22 11:54:42 -05:00
Daniel Vetter	4a7266123f	drm/i915: move a gtt flush to the correct place No functional change, because gtt flushing is a no-op. Still, try to keep the bookkeeping accurate. The if is still slightly wrong for with execbuf2 even i915-class hw doesn't always need a fence reg for gpu access. But that's for somewhen lateron. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-02-22 11:54:41 -05:00
Eric Anholt	b397c836ef	drm/i915: Don't reserve compatibility fence regs in KMS mode. The fence start is for compatibility with UMS X Servers before fence management. KMS X Servers only started doing tiling after fence management appeared. Signed-off-by: Eric Anholt <eric@anholt.net>	2010-02-16 11:48:44 -08:00
Chris Wilson	29105ccc43	drm/i915: Replace open-coded eviction in i915_gem_idle() With the introduction of the hang-check, we can safely expect that i915_wait_request() will always return even when the GPU hangs, and so do not need to open code the wait in order to manually check for the hang. Also we do not need to always evict all buffers, so only flush the GPU (and wait for it to idle) for KMS, but continue to evict for UMS. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-02-16 11:48:43 -08:00
Luca Barbieri	bc9025bdc4	Use drm_gem_object_[handle_]unreference_unlocked where possible Mostly obvious simplifications. The i915 pread/pwrite ioctls, intel_overlay_put_image and nouveau_gem_new were incorrectly using the locked versions without locking: this is also fixed in this patch. Signed-off-by: Luca Barbieri <luca@luca-barbieri.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2010-02-11 14:22:34 +10:00
Owain Ainsworth	a40e8d3139	drm/i915: Correctly return -ENOMEM on allocation failure in cmdbuf ioctls. Signed-off-by: Owain G. Ainsworth <oga@openbsd.org> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-02-10 15:08:00 -08:00
Daniel Vetter	99fcb766a3	drm/i915: Update write_domains on active list after flush. Before changing the status of a buffer with a pending write we will await upon a new flush for that buffer. So we can take advantage of any flushes posted whilst the buffer is active and pending processing by the GPU, by clearing its write_domain and updating its last_rendering_seqno -- thus saving a potential flush in deep queues and improves flushing behaviour upon eviction for both GTT space and fences. In order to reduce the time spent searching the active list for matching write_domains, we move those to a separate list whose elements are the buffers belong to the active/flushing list with pending writes. Orignal patch by Chris Wilson <chris@chris-wilson.co.uk>, forward-ported by me. In addition to better performance, this also fixes a real bug. Before this changes, i915_gem_evict_everything didn't work as advertised. When the gpu was actually busy and processing request, the flush and subsequent wait would not move active and dirty buffers to the inactive list, but just to the flushing list. Which triggered the BUG_ON at the end of this function. With the more tight dirty buffer tracking, all currently busy and dirty buffers get moved to the inactive list by one i915_gem_flush operation. I've left the BUG_ON I've used to prove this in there. References: Bug 25911 - 2.10.0 causes kernel oops and system hangs http://bugs.freedesktop.org/show_bug.cgi?id=25911 Bug 26101 - [i915] xf86-video-intel 2.10.0 (and git) triggers kernel oops within seconds after login http://bugs.freedesktop.org/show_bug.cgi?id=26101 Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Adam Lantos <hege@playma.org> Cc: stable@kernel.org Signed-off-by: Eric Anholt <eric@anholt.net>	2010-02-10 13:31:45 -08:00
Linus Torvalds	f6510ec5a9	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel: drm/i915: Fix leak of relocs along do_execbuffer error path drm/i915: slow acpi_lid_open() causes flickering - V2 drm/i915: Disable SR when more than one pipe is enabled drm/i915: page flip support for Ironlake drm/i915: Fix the incorrect DMI string for Samsung SX20S laptop drm/i915: Add support for SDVO composite TV drm/i915: don't trigger ironlake vblank interrupt at irq install drm/i915: handle non-flip pending case when unpinning the scanout buffer drm/i915: Fix the device info of Pineview drm/i915: enable vblank interrupt on ironlake drm/i915: Prevent use of uninitialized pointers along error path. drm/i915: disable hotplug detect before Ironlake CRT detect	2010-02-06 13:01:39 -08:00
Chris Wilson	93533c291a	drm/i915: Fix leak of relocs along do_execbuffer error path Following a gpu hang, we would leak the relocation buffer. So simply earrange the error path to always free the relocation buffer. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-02-04 09:04:01 -08:00
Chris Wilson	4bdadb9785	drm/i915: Selectively enable self-reclaim Having missed the ENOMEM return via i915_gem_fault(), there are probably other paths that I also missed. By not enabling NORETRY by default these paths can run the shrinker and take memory from the system (but not from our own inactive lists because our shrinker can not run whilst we hold the struct mutex) and this may allow the system to survive a little longer whilst our drivers consume all available memory. References: OOM killer unexpectedly called with kernel 2.6.32 http://bugzilla.kernel.org/show_bug.cgi?id=14933 v2: Pass gfp into page mapping. v3: Use new read_cache_page_gfp() instead of open-coding. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Eric Anholt <eric@anholt.net> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-01-27 09:26:43 -08:00
Chris Wilson	0ce907f891	drm/i915: Prevent use of uninitialized pointers along error path. X.org hang with [drm:i915_gem_do_execbuffer] ERROR in dmesg http://bugzilla.kernel.org/show_bug.cgi?id=15114 Matej found he was hitting an error path within i915_gem_do_execbuffer() that led to the attempt to dereference an uninitialised pointer during cleanup. This path used to be safe as we used to calloc the object lists, but this was changed in `c8e0f93`. Daniel Vetter had also spotted this error and proposed a similar patch. [ 6379.732892] [drm:i915_gem_do_execbuffer] ERROR Object ffff880098cd6540 appears more than once in object list [ 6379.740976] [drm:i915_gem_do_execbuffer] ERROR Object ffff880098cd6540 appears more than once in object list [ 6379.740995] BUG: unable to handle kernel NULL pointer dereference at 00000000000000a0 [ 6379.740998] IP: [<ffffffff8122ddb5>] i915_gem_do_execbuffer+0xba5/0x1260 [ 6379.741006] PGD babab067 PUD bb435067 PMD 0 [ 6379.741010] Oops: 0002 [#1] PREEMPT SMP [ 6379.741014] last sysfs file: /sys/devices/pci0000:00/0000:00:1c.2/0000:06:00.0/ieee80211/phy0/rfkill0/state [ 6379.741017] CPU 1 [ 6379.741021] Pid: 2186, comm: X Not tainted 2.6.33-rc4-00399-g24bc734 #142 M11D/ESPRIMO Mobile M9400 [ 6379.741023] RIP: 0010:[<ffffffff8122ddb5>] [<ffffffff8122ddb5>] i915_gem_do_execbuffer+0xba5/0x1260 [ 6379.741027] RSP: 0018:ffff8800b9047b78 EFLAGS: 00213206 [ 6379.741029] RAX: 0000000000000000 RBX: 000000000000004f RCX: ffff880098cac800 [ 6379.741032] RDX: ffff880098caca78 RSI: ffff8800b9047c98 RDI: ffff880098cd6540 [ 6379.741034] RBP: ffff8800b9047c78 R08: ffffffff814b96b5 R09: 0000000000000006 [ 6379.741036] R10: 0000000000000000 R11: 0000000000000003 R12: 000000000000004e [ 6379.741038] R13: 00000000fffffff7 R14: 0000000000000000 R15: 0000000000000001 [ 6379.741041] FS: 0000000000000000(0000) GS:ffff880001900000(0063) knlGS:00000000f72636c0 [ 6379.741043] CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033 [ 6379.741041] FS: 0000000000000000(0000) GS:ffff880001900000(0063) knlGS:00000000f72636c0 [ 6379.741043] CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033 [ 6379.741045] CR2: 00000000000000a0 CR3: 00000000b9000000 CR4: 00000000000006e0 [ 6379.741048] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 6379.741050] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 6379.741052] Process X (pid: 2186, threadinfo ffff8800b9046000, task ffff8800bb5d8000) [ 6379.741054] Stack: [ 6379.741055] ffffc90023f57000 ffffc90023f56fff ffffc90023f56fff ffffc90023f55000 [ 6379.741059] <0> ffff8800b9047c98 ffff8800bb43c840 ffff8800bf1de800 ffff8800bf1de820 [ 6379.741063] <0> ffff8800b9047bd8 ffff880098cac800 0000000000000000 0000000000000002 [ 6379.741068] Call Trace: [ 6379.741072] [<ffffffff8122e6cb>] ? i915_gem_execbuffer+0x6b/0x370 [ 6379.741077] [<ffffffff810a5f52>] ? __vmalloc_node+0xa2/0xb0 [ 6379.741080] [<ffffffff8122e6cb>] ? i915_gem_execbuffer+0x6b/0x370 [ 6379.741083] [<ffffffff8122e816>] i915_gem_execbuffer+0x1b6/0x370 [ 6379.741086] [<ffffffff8120cd55>] drm_ioctl+0x1d5/0x460 [ 6379.741089] [<ffffffff8122e660>] ? i915_gem_execbuffer+0x0/0x370 [ 6379.741093] [<ffffffff81248c35>] i915_compat_ioctl+0x45/0x50 [ 6379.741097] [<ffffffff810f1659>] compat_sys_ioctl+0xa9/0x1570 [ 6379.741102] [<ffffffff810b1d5c>] ? vfs_read+0x13c/0x1a0 [ 6379.741106] [<ffffffff81028424>] sysenter_dispatch+0x7/0x2b [ 6379.741108] Code: 08 85 c0 74 52 31 db 0f 1f 80 00 00 00 00 48 63 c3 48 8b 8d 68 ff ff ff 48 8d 14 c1 48 8b 02 48 85 c0 74 25 48 8b 80 80 00 00 00 <c7> 80 a0 00 00 00 00 00 00 00 48 8b 3a 48 85 ff 74 0c 48 c7 c6 [ 6379.741142] RIP [<ffffffff8122ddb5>] i915_gem_do_execbuffer+0xba5/0x1260 [ 6379.741145] RSP <ffff8800b9047b78> [ 6379.741147] CR2: 00000000000000a0 [ 6379.741159] ---[ end trace 0598809afa4c31db ]--- Reported-by: Matej Laitl <strohel@gmail.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-01-25 09:00:38 -08:00
Eric Anholt	6036ae7e94	drm/i915: Remove chatty execbuf failure message. Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> (in principle) Signed-off-by: Eric Anholt <eric@anholt.net>	2010-01-15 13:05:36 -08:00
Zhenyu Wang	b9241ea31f	drm/i915: Don't wait interruptible for possible plane buffer flush When we setup buffer for display plane, we'll check any pending required GPU flush and possible make interruptible wait for flush complete. But that wait would be most possibly to fail in case of signals received for X process, which will then fail modeset process and put display engine in unconsistent state. The result could be blank screen or CPU hang, and DDX driver would always turn on outputs DPMS after whatever modeset fails or not. So this one creates new helper for setup display plane buffer, and when needing flush using uninterruptible wait for that. This one should fix bug like https://bugs.freedesktop.org/show_bug.cgi?id=24009. Also fixing mode switch stress test on Ironlake. Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-01-12 15:07:34 -08:00
Linus Torvalds	2c1f1895ef	Merge branch 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: drm/radeon/kms: rs600: use correct mask for SW interrupt gpu/drm/radeon/radeon_irq.c: move a dereference below a NULL test drm/radeon/radeon_device.c: move a dereference below a NULL test drm/radeon/radeon_fence.c: move a dereference below the NULL test drm/radeon/radeon_connectors.c: add a NULL test before dereference drm/radeon/kms: fix memory leak drm/kms: Fix &&/\|\| confusion in drm_fb_helper_connector_parse_command_line() drm/edid: Fix CVT width/height decode drm/edid: Skip empty CVT codepoints drm: remove address mask param for drm_pci_alloc() drm/radeon/kms: add missing breaks in i2c and ss lookups drm/radeon/kms: add primary dac adj values table drm/radeon/kms: fallback to default connector table	2010-01-06 20:26:42 -08:00
Zhenyu Wang	e6be8d9d17	drm: remove address mask param for drm_pci_alloc() drm_pci_alloc() has input of address mask for setting pci dma mask on the device, which should be properly setup by drm driver. And leave it as a param for drm_pci_alloc() would cause confusion or mistake would corrupt the correct dma mask setting, as seen on intel hw which set wrong dma mask for hw status page. So remove it from drm_pci_alloc() function. Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2010-01-07 13:15:50 +10:00
Chris Wilson	e3d8affb0d	drm/i915: Permit pinning whilst the device is 'suspended' As pinning (allocating and binding GTT memory) does not actually invoke GPU commands, it is safe, and indeed is attempted, during resumption from suspension: [drm:intel_init_clock_gating] ERROR failed to pin power context: -16 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reported-by: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: stable@kernel.org Signed-off-by: Eric Anholt <eric@anholt.net>	2010-01-06 09:40:11 -08:00
Jesse Barnes	76446cac68	drm/i915: execbuf2 support This patch adds a new execbuf ioctl, execbuf2, for use by clients that want to control fence register allocation more finely. The buffer passed in to the new ioctl includes a new relocation type to indicate whether a given object needs a fence register assigned for the command buffer in question. Compatibility with the existing execbuf ioctl is implemented in terms of the new code, preserving the assumption that fence registers are required for pre-965 rendering commands. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> [ickle: Remove pre-emptive clear_fence_reg()] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> [anholt: Removed dmesg spam] Signed-off-by: Eric Anholt <eric@anholt.net>	2010-01-06 09:39:39 -08:00
Daniel Vetter	96b47b6559	drm/i915: fix order of fence release wrt flushing i915_gem_object_unbind had the ordering wrong. The other user, i915_gem_object_put_fence_reg already has the correct ordering. Results was usually corrupted pixmaps, especially garbled font glyphs after a suspend/resume (because this evicts everything). I'm still waiting for the feedback from the bug-reporters, but because this obviously fixes a bug (at least for me) I'm already submitting it. Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=25406 Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Eric Anholt <eric@anholt.net> CC: stable@kernel.org	2009-12-16 09:18:37 -08:00
Linus Torvalds	3ef884b4c0	Merge branch 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (189 commits) drm/radeon/kms: fix warning about cur_placement being uninitialised. drm/ttm: Print debug information on memory manager when eviction fails drm: Add memory manager debug function drm/radeon/kms: restore surface registers on resume. drm/radeon/kms/r600/r700: fallback gracefully on ucode failure drm/ttm: Initialize eviction placement in case the driver callback doesn't drm/radeon/kms: cleanup structure and module if initialization fails drm/radeon/kms: actualy set the eviction placements we choose drm/radeon/kms: Fix NULL ptr dereference drm/radeon/kms/avivo: add support for new pll selection algo drm/radeon/kms/avivo: fix some bugs in the display bandwidth setup drm/radeon/kms: fix return value from fence function. drm/radeon: Remove tests for -ERESTART from the TTM code. drm/ttm: Have the TTM code return -ERESTARTSYS instead of -ERESTART. drm/radeon/kms: Convert radeon to new TTM validation API (V2) drm/ttm: Rework validation & memory space allocation (V3) drm: Add search/get functions to get a block in a specific range drm/radeon/kms: fix avivo tiling regression since radeon object rework drm/i915: Remove a debugging printk from hangcheck drm/radeon/kms: make sure i2c id matches ...	2009-12-10 21:56:47 -08:00
Chris Wilson	5618ca6abc	drm/i915: Set the error code after failing to insert new offset into mm ht. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org Signed-off-by: Eric Anholt <eric@anholt.net>	2009-12-07 15:44:30 -08:00
Adam Jackson	f2b115e69d	drm/i915: Fix product names and #defines IGD* isn't a useful name. Replace with the codenames, as sourced from pci.ids. Signed-off-by: Adam Jackson <ajax@redhat.com> [anholt: Fixed up for merge with pineview/ironlake changes] Signed-off-by: Eric Anholt <eric@anholt.net>	2009-12-07 14:55:56 -08:00
Chris Wilson	ffb4728095	drm/i915: Drop a some common DRM_ERROR() These are handled by the error return being propagated to user-space and do not any add any information to the original error, so are useless. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-12-07 12:18:28 -08:00
André Goddard Rosa	af901ca181	tree-wide: fix assorted typos all over the place That is "success", "unknown", "through", "performance", "[re\|un]mapping" , "access", "default", "reasonable", "[con]currently", "temperature" , "channel", "[un]used", "application", "example","hierarchy", "therefore" , "[over\|under]flow", "contiguous", "threshold", "enough" and others. Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2009-12-04 15:39:55 +01:00
Kristian Høgsberg	6b95a207c1	drm/i915: Add intel implementation of the pageflip ioctl Acked-by: Jakob Bornecrantz <jakob@vmware.com> Acked-by: Thomas Hellström <thomas@shipmail.org> Review-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Jesse "Orange Smoothie" Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-12-01 09:10:35 -08:00
Eric Anholt	c8e0f93a38	drm/i915: Replace a calloc followed by copying data over it with malloc. Execbufs involve quite a bit of payload, to the extent that cache misses show up in the profiles here, and a suspicion that some of those cachelines may get evicted and then reloaded in the subsequent copy. This is still abstracted like drm_calloc_large since we want to check for size overflow, and because we want to choose between kmalloc and vmalloc on the fly. cairo's interface for malloc-with-calloc's-args was used as the model. Signed-off-by: Eric Anholt <eric@anholt.net>	2009-11-25 06:36:21 -08:00
Zhao Yakui	44d98a6142	drm/i915: Replace DRM_DEBUG with DRM_DEBUG_DRIVER Replace the DRM_DEBUG with DRM_DEBUG_DRIVER in generic i915 driver. Then the debug info can be obtained by adding the boot option of "drm.debug=0x02". At the same time the debug info in increase/decrease clock is also printed by using DRM_DEBUG_DRIVER instead of DRM_DEBUG_KMS. Signed-off-by: Zhao Yakui <yakui.zhao@intel.com> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-11-05 14:47:10 -08:00
Daniel Vetter	1df4b35b61	drm/i915: kill i915_lp_ring_sync It's not needed anymore. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-11-05 14:47:09 -08:00
Daniel Vetter	5a5a0c64a9	drm/i915: implement fastpath for overlay flip waiting As long as the gpu can keep up, neither the cpu (waiting for gpu) nore the gpu (waiting for vblank to do an overlay flip) stalls. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-11-05 14:47:09 -08:00
Daniel Vetter	48764bf43f	drm/i915: add i915_lp_ring_sync helper This just waits until the hw passed the current ring position with cmd execution. This slightly changes the existing i915_wait_request function to make uninterruptible waiting possible - no point in returning to userspace while mucking around with the overlay, that piece of hw is just too fragile. Also replace a magic 0 with the symbolic constant (and kill the then superflous comment) while I was looking at the code. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-11-05 14:47:07 -08:00
Chris Wilson	9d34e5db07	drm/i915: Enable irq to trace batch buffer completion. If we trigger a tracepoint for batch buffer submission, it is a reasonable assumption that we wish to also trace the batch buffer completion. So in order to capture the completion events, we need to enable irqs... However, we cannot rely on the completion event to disable the irq later, so we defer the irq disable to the retire request. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2009-09-29 03:15:25 +01:00
Chris Wilson	8f0dc5bf17	drm/i915: batch submit seqno off-by-one. We increment the seqno number between submitting the batch buffer and the flush/interrupt that demarcates its end, so the tracepoint needs to reference the incremented value to match the completion event. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2009-09-29 03:15:24 +01:00
Linus Torvalds	94e0fb086f	Merge branch 'drm-intel-next' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel * 'drm-intel-next' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel: (57 commits) drm/i915: Handle ERESTARTSYS during page fault drm/i915: Warn before mmaping a purgeable buffer. drm/i915: Track purged state. drm/i915: Remove eviction debug spam drm/i915: Immediately discard any backing storage for uneeded objects drm/i915: Do not mis-classify clean objects as purgeable drm/i915: Whitespace correction for madv drm/i915: BUG_ON page refleak during unbind drm/i915: Search harder for a reusable object drm/i915: Clean up evict from list. drm/i915: Add tracepoints drm/i915: framebuffer compression for GM45+ drm/i915: split display functions by chip type drm/i915: Skip the sanity checks if the current relocation is valid drm/i915: Check that the relocation points to within the target drm/i915: correct FBC update when pipe base update occurs drm/i915: blacklist Acer AspireOne lid status ACPI: make ACPI button funcs no-ops if not built in drm/i915: prevent FIFO calculation overflows on 32 bits with high dotclocks drm/i915: intel_display.c handle latency variable efficiently ... Fix up trivial conflicts in drivers/gpu/drm/i915/{i915_dma.c\|i915_drv.h}	2009-09-24 10:30:41 -07:00
Chris Wilson	c715089f49	drm/i915: Handle ERESTARTSYS during page fault During a page fault and rebinding the buffer there exists a window for a signal to arrive during the i915_wait_request() and trigger a ERESTARTSYS. This used to be handled by returning SIGBUS and thereby killing the application. Try 'cairo-perf-trace & cairo-test-suite' and watch X go boom! The solution as suggested by H. Peter Anvin is to simply return NOPAGE and leave the higher layers to spot we did not fill the page and resubmit the page fault. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org [anholt: Mostly squash it with another commit]	2009-09-22 18:25:32 -07:00

... 9 10 11 12 13 ...

1197 Commits