linux

Author	SHA1	Message	Date
Daniel Vetter	c6642782b9	drm/i915: Add a mechanism for pipelining fence register updates Not employed just yet... Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-25 15:01:39 +00:00
Chris Wilson	caea7476d4	drm/i915: More accurately track last fence usage by the GPU Based on a patch by Daniel Vetter. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-24 13:30:52 +00:00
Chris Wilson	a7a09aebe8	drm/i915: Rework execbuffer pinning Avoid evicting buffers that will be used later in the batch in order to make room for the initial buffers by pinning all bound buffers in a single pass before binding (and evicting for) fresh buffer. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-24 13:30:51 +00:00
Chris Wilson	919926aeb3	drm/i915: Thread the pipelining ring through the callers. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-23 20:19:16 +00:00
Chris Wilson	dddbc0e525	drm/i915: Remove a defunct BUG_ON This used to check the precondition that all fences were to be located in a mappable area, redundant now as those two parameters are combined into one. After pinning, we assert that the buffer is bound into the desired region. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-23 20:19:15 +00:00
Chris Wilson	b6913e4bdb	drm/i915: Move the implementation details of PIPE_CONTROL to the ringbuffer The pipe control object is allocated by the device for the sole use of the render ringbuffer. Move this detail from the general code to the render ring buffer initialisation. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-23 20:19:14 +00:00
Chris Wilson	92b88aeb1a	drm/i915: Not all mappable regions require GTT fence regions Combining map_and_fenceable revealed a bug in i915_gem_object_gtt_size() in that it always computed the appropriate fence size for the object regardless of tiling state which caused us to over-allocate linear buffers when binding to the GTT. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-23 20:19:13 +00:00
Chris Wilson	05394f3975	drm/i915: Use drm_i915_gem_object as the preferred type A glorified s/obj_priv/obj/ with a net reduction of over a 100 lines and many characters! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-23 20:19:10 +00:00
Daniel Vetter	7c2e6fdf45	drm/i915: move gtt handling to i915_gem_gtt.c No more drm_*_agp in i915_gem.c! Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-23 20:14:47 +00:00
Daniel Vetter	93a37f20ea	drm/i915: track objects in the gtt This is required to restore gtt mappings on resume when agp is gone. The right way to do this would be to make sturct drm_mm_node embeddable and use the allocation list maintained by the drm memory manager. But that's a bigger project. Getting rid of the per bo agp_mem will save more memory than this wastes, anyway. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-23 20:14:45 +00:00
Daniel Vetter	40ce657510	drm/i915/gtt: call chipset flush directly Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-23 20:14:44 +00:00
Daniel Vetter	23ed992a5e	drm/i915\|intel-gtt: consolidate intel-gtt.h headers ... and a few other defines. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-23 20:14:43 +00:00
Chris Wilson	e384eafc1c	Merge branch 'drm-intel-fixes' into drm-intel-next	2010-11-23 20:13:13 +00:00
Chris Wilson	bcf50e2775	drm/i915: Handle pagefaults in execbuffer user relocations Currently if we hit a pagefault when applying a user relocation for the execbuffer, we bail and return EFAULT to the application. Instead, we need to unwind, drop the dev->struct_mutex, copy all the relocation entries to a vmalloc array (to avoid any potential circular deadlocks when resolving the pagefault), retake the mutex and then apply the relocations. Afterwards, we need to again drop the lock and copy the vmalloc array back to userspace. v2: Incorporate feedback from Daniel Vetter. Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2010-11-23 20:11:43 +00:00
Chris Wilson	e624ae8e0d	Merge branch 'drm-intel-fixes' into drm-intel-next Conflicts: drivers/gpu/drm/i915/i915_gem.c	2010-11-22 08:51:36 +00:00
Chris Wilson	d1d788302e	drm/i915: Prevent integer overflow when validating the execbuffer Commit `2549d6c2` removed the vmalloc used for temporary storage of the relocation lists used during execbuffer. However, our use of vmalloc was being protected by an integer overflow check which we do want to preserve! Reported-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-21 09:30:58 +00:00
Chris Wilson	51311d0a5c	drm/i915: Do not hold mutex when faulting in user addresses Linus Torvalds found that it was rather trivial to trigger a system freeze: In fact, with lockdep, I don't even need to do the sysrq-d thing: it shows the bug as it happens. It's the X server taking the same lock recursively. Here's the problem: ============================================= [ INFO: possible recursive locking detected ] 2.6.37-rc2-00012-gbdbd01a #7 --------------------------------------------- Xorg/2816 is trying to acquire lock: (&dev->struct_mutex){+.+.+.}, at: [<ffffffff812c626c>] i915_gem_fault+0x50/0x17e but task is already holding lock: (&dev->struct_mutex){+.+.+.}, at: [<ffffffff812c403b>] i915_mutex_lock_interruptible+0x28/0x4a other info that might help us debug this: 2 locks held by Xorg/2816: #0: (&dev->struct_mutex){+.+.+.}, at: [<ffffffff812c403b>] i915_mutex_lock_interruptible+0x28/0x4a #1: (&mm->mmap_sem){++++++}, at: [<ffffffff81022d4f>] page_fault+0x156/0x37b This recursion was introduced by rearranging the locking to avoid the double locking on the fast path (4f27b5d and `fbd5a26d`) and the introduction of the prefault to encourage the fast paths (b5e4f2b). In order to undo the problem, we rearrange the code to perform the access validation upfront, attempt to prefault and then fight for control of the mutex. the best case scenario where the mutex is uncontended the prefaulting is not wasted. Reported-and-tested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-19 09:30:15 +00:00
Chris Wilson	c94f28c383	Merge branch 'drm-intel-fixes' into drm-intel-next Conflicts: drivers/gpu/drm/i915/i915_gem.c drivers/gpu/drm/i915/intel_ringbuffer.c	2010-11-15 06:49:30 +00:00
Chris Wilson	1bb95834bb	Merge remote branch 'airlied/drm-fixes' into drm-intel-fixes	2010-11-15 06:33:11 +00:00
Daniel Vetter	5e78330126	drm/i915: fix relaxed tiling for gen <= 3 && !g33 g33/pineview doesn't have any alignment constrains for unfenced tiled buffers. But older chips have. Fix this. Problem introduced in `a00b10c360`. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2010-11-15 05:22:16 +00:00
Chris Wilson	85345517fe	drm/i915: Retire any pending operations on the old scanout when switching An old and oft reported bug, is that of the GPU hanging on a MI_WAIT_FOR_EVENT following a mode switch. The cause is that the GPU is waiting on a scanline counter on an inactive pipe, and so waits for a very long time until eventually the user reboots his machine. We can prevent this either by moving the WAIT into the kernel and thereby incurring considerable cost on every swapbuffers, or by waiting for the GPU to retire the last batch that accesses the framebuffer before installing a new one. As mode switches are much rarer than swap buffers, this looks like an easy choice. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=28964 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=29252 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org	2010-11-13 09:49:11 +00:00
Chris Wilson	5d97eb69bd	drm/i915: Only add the lazy request if we end up waiting for it. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-10 20:41:16 +00:00
Joe Perches	fce7d61be0	drivers/gpu/drm: Update WARN uses Coalesce long formats. Align arguments. Add missing newlines. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2010-11-09 13:37:15 +10:00
Chris Wilson	b47b30ccda	drm/i915: Avoid might_fault during pwrite whilst holding our mutex ... and so prevent a potential circular reference: [ INFO: possible circular locking dependency detected ] 2.6.37-rc1-uwe1+ #4 ------------------------------------------------------- Xorg/1401 is trying to acquire lock: (&mm->mmap_sem){++++++}, at: [<c01e4ddb>] might_fault+0x4b/0xa0 but task is already holding lock: (&dev->struct_mutex){+.+.+.}, at: [<f869c3ac>] i915_mutex_lock_interruptible+0x3c/0x60 [i915] which lock already depends on the new lock. When the locking around the pwrite ioctl was simplified, I did not spot that the phys path never took any locks and so we introduced this potential circular reference. Reported-by: Uwe Helm <uwe.helm@googlemail.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-08 09:19:11 +00:00
Chris Wilson	045e769ab6	drm/i915: Handle GPU hangs during fault gracefully. Instead of killing the process, just return no page found and reschedule the process giving the GPU some time to (hopefully) recover. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-07 09:18:22 +00:00
Daniel Vetter	75e9e9158f	drm/i915: kill mappable/fenceable disdinction `a00b10c360` "Only enforce fence limits inside the GTT" also added a fenceable/mappable disdinction when binding/pinning buffers. This only complicates the code with no pratical gain: - In execbuffer this matters on for g33/pineview, as this is the only chip that needs fences and has an unmappable gtt area. But fences are only possible in the mappable part of the gtt, so need_fence implies need_mappable. And need_mappable is only set independantly with relocations which implies (for sane userspace) that the buffer is untiled. - The overlay code is only really used on i8xx, which doesn't have unmappable gtt. And it doesn't support tiled buffers, currently. - For all other buffers it's a bug to pass in a tiled bo. In short, this disdinction doesn't have any practical gain. I've also reverted mapping the overlay and context pages as possibly unmappable. It's not worth being overtly clever here, all the big gains from unmappable are for execbuf bos. Also add a comment for a clever optimization that confused me while reading the original patch by Chris Wilson. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-04 19:02:03 +00:00
Chris Wilson	085ce26437	drm/i915: Ensure that if we ever try to pin+fence it is mappable. When merging Daniel's full-gtt patches I had a set of tweaks which I thought I had undone. I was half right... Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=31286 Reported-by: jinjin.wang@intel.com Reported-by: Alexey Fisher <bug-track@fisher-privat.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-03 09:31:57 +00:00
Chris Wilson	f2a630bfec	Merge branch 'drm-intel-fixes' into drm-intel-next Conflicts: drivers/gpu/drm/i915/i915_gem.c drivers/gpu/drm/i915/i915_gem_evict.c	2010-11-01 13:44:41 +00:00
Chris Wilson	c6afd65807	drm/i915: Apply big hammer to serialise buffer access between rings Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org	2010-11-01 13:39:24 +00:00
Chris Wilson	0f8c6d7ca9	drm/i915: Move the invalidate\|flush information out of the device struct ... and into a local structure scoped for the single function in which it is used. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-01 12:38:44 +00:00
Chris Wilson	13b2928933	drm/i915: Apply big hammer to serialise buffer access between rings Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-01 12:31:19 +00:00
Chris Wilson	5eac3ab459	drm/i915: Evict just the purgeable GTT entries on the first pass Take two passes to evict everything whilst searching for sufficient free space to bind the batchbuffer. After searching for sufficient free space using LRU eviction, evict everything that is purgeable and try again. Only then if there is insufficient free space (or the GTT is too badly fragmented) evict everything from the aperture and try one last time. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-31 12:31:30 +00:00
Chris Wilson	ff75b9bc48	drm/i915: Fix typo from `e5281ccd` in i915_gem_attach_phys_object() Accessing the uninitialised obj->pages instead of the local page lead to an OOPs. Reported-by: Xavier Chantry <chantry.xavier@gmail.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-30 22:52:31 +01:00
Chris Wilson	872d860c85	drm/i915: Remove the duplicate domain-change tracepoint for GPU flush Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-29 11:15:54 +01:00
Chris Wilson	a00b10c360	drm/i915: Only enforce fence limits inside the GTT. So long as we adhere to the fence registers rules for alignment and no overlaps (including with unfenced accesses to linear memory) and account for the tiled access in our size allocation, we do not have to allocate the full fenced region for the object. This allows us to fight the bloat tiling imposed on pre-i965 chipsets and frees up RAM for real use. [Inside the GTT we still suffer the additional alignment constraints, so it doesn't magic allow us to render larger scenes without stalls -- we need the expanded GTT and fence pipelining to overcome those...] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-29 11:15:07 +01:00
Chris Wilson	7465378fd7	drm/i915: Convert BUG_ON(pin_count) from an impossible condition Also spotted by Dan Carpenter. obj->pin_count is unsigned so the BUG_ON(obj->pin_count<0) will never trigger. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-29 10:54:29 +01:00
Chris Wilson	bbe2e11a4b	drm/i915: Do not return -1 from shrinker when nr_to_scan == 0 The error code is only expected during the actual pruning and not during the first measurement (nr_to_scan == 0) pass. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-28 22:35:07 +01:00
Chris Wilson	395b70be54	drm/i915: Flush read-only buffers from the active list upon idle as well It is possible for the active list to only contain a read-only buffer so that the ring->gpu_write_list remains entry. This leads to an inconsistency between i915_gpu_is_active() and i915_gpu_idle() causing an infinite spin during the shrinker and an assertion failure that i915_gpu_idle() does indeed flush all buffers from the active lists. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-28 21:31:19 +01:00
Chris Wilson	4a684a4117	drm/i915: Kill GTT mappings when moving from GTT domain In order to force a page-fault on a GTT mapping after we start using it from the GPU and so enforce correct CPU/GPU synchronisation, we need to invalidate the mapping. Pointed out by Owain G. Ainsworth. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-28 20:55:03 +01:00
Chris Wilson	e5281ccd2e	drm/i915: Eliminate nested get/put pages By using read_cache_page() for individual pages during pwrite/pread we can eliminate an unnecessary large allocation (and immediate free) of obj->pages. Also this eliminates any potential nesting of get/put pages, simplifying the code and preparing the path for greater things. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-28 20:55:02 +01:00
Chris Wilson	39a01d1fb6	drm/i915: Remove mmap_offset Since we rarely use the mmap_offset and it is easily computable from the obj->map_list.hash, remove it. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-28 20:55:02 +01:00
Chris Wilson	17250b7155	drm/i915: Make the inactive object shrinker per-device Eliminate the racy device unload by embedding a shrinker into each device. Smaller, simpler code. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-28 20:55:01 +01:00
Chris Wilson	da761a6edf	drm/i915: Bail early if we try to mmap an object too large to be mapped. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-27 23:31:08 +01:00
Daniel Vetter	fb7d516af1	drm/i915: add accounting for mappable objects in gtt v2 More precisely: For those that _need_ to be mappable. Also add two BUG_ONs in fault and pin to check the consistency of the mappable flag. Changes in v2: - Add tracking of gtt mappable space (to notice mappable/unmappable balancing issues). - Improve the mappable working set tracking by tracking fault and pin separately. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-27 23:31:08 +01:00
Daniel Vetter	ec57d2602a	drm/i915: add mappable to gem_object_bind tracepoint This way we can make some more educated guesses as to why exactly we can't use 2G apertures to their full potential ;) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-27 23:31:07 +01:00
Daniel Vetter	53984635a6	drm/i915: use the complete gtt At least the part that's currently enabled by the BIOS. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-27 23:31:06 +01:00
Daniel Vetter	16e809acc1	drm/i915: unbind unmappable objects on fault/pin In i915_gem_object_pin obviously unbind only if mappable is true. This is the last part to enable gtt_mappable_end != gtt_size, which the next patch will do. v2: Fences on g33/pineview only work in the mappable part of the gtt. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-27 23:31:05 +01:00
Daniel Vetter	920afa77ce	drm/i915: range-restricted bind_to_gtt Like before add a parameter mappable (also to gem_object_pin) and set it depending upon the context. Only bos that are brought into the gtt due to an execbuffer call can be put into the unmappable part of the gtt, everything else (especially pinned objects) need to be put into the mappable part of the gtt. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-27 23:31:05 +01:00
Daniel Vetter	a6e0aa4214	drm/i915: range-restricted eviction support Add a mappable parameter to i915_gem_evict_something to distinguish the two cases (non-restricted vs. mappable gtt allocations). No functional changes because the mappable limit is set to the end of the gtt currently. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-27 23:31:04 +01:00
Chris Wilson	3cce469cab	drm/i915: Propagate error from failing to queue a request Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-27 23:31:03 +01:00
Chris Wilson	b2223497b4	drm/i915: Remove the confusing global waiting/irq seqno Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-27 23:30:59 +01:00
Chris Wilson	7e318e18f2	drm/i915: Move object to GPU domains after dispatching execbuffer In the event that we fail to dispatch the execbuffer, for example if there is insufficient space on the ring, we were leaving the objects in an inconsistent state. Notably they were marked as being in the GPU write domain, but were not added to the ring or any list. This would lead to inevitable oops: [ 1010.522940] [drm:i915_gem_do_execbuffer] ERROR dispatch failed -16 [ 1010.523055] BUG: unable to handle kernel NULL pointer dereference at 0000000000000088 [ 1010.523097] IP: [<ffffffff8122d006>] i915_gem_flush_ring+0x26/0x140 [ 1010.523120] PGD 14cf2f067 PUD 14ce04067 PMD 0 [ 1010.523140] Oops: 0000 [#1] SMP [ 1010.523154] last sysfs file: /sys/devices/virtual/vc/vcsa2/uevent [ 1010.523173] CPU 0 [ 1010.523183] Pid: 716, comm: X Not tainted 2.6.36+ #34 LosLunas CRB/SandyBridge Platform [ 1010.523206] RIP: 0010:[<ffffffff8122d006>] [<ffffffff8122d006>] i915_gem_flush_ring+0x26/0x140 [ 1010.523233] RSP: 0018:ffff88014bf97cd8 EFLAGS: 00010296 [ 1010.523249] RAX: ffff88014e2d1808 RBX: 0000000000000000 RCX: 0000000000000000 [ 1010.523270] RDX: 0000000000000002 RSI: 0000000000000000 RDI: 0000000000000000 [ 1010.523290] RBP: ffff88014e2d1000 R08: 0000000000000002 R09: 00000000400c645f [ 1010.523311] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000002 [ 1010.523331] R13: ffff88014e29a000 R14: 00000000000000c8 R15: ffffffff8162eb28 [ 1010.523352] FS: 00007fc62379d700(0000) GS:ffff88001fc00000(0000) knlGS:0000000000000000 [ 1010.523375] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1010.523392] CR2: 0000000000000088 CR3: 000000014bf87000 CR4: 00000000000406f0 [ 1010.523412] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1010.523433] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 1010.523454] Process X (pid: 716, threadinfo ffff88014bf96000, task ffff88014cc1ee40) [ 1010.523475] Stack: [ 1010.523483] ffff88014d5199c0 0000000000000200 0000000000000000 ffff88014bcc6400 [ 1010.523509] <0> 0000000000000000 0000000000000001 ffff88014e29a000 ffff88014bcc6400 [ 1010.523537] <0> ffffffff8162eb28 ffffffff8122faa8 ffff88014e29a000 ffff88014bcc6400 [ 1010.523568] Call Trace: [ 1010.523578] [<ffffffff8122faa8>] ? i915_gem_object_flush_gpu_write_domain+0x48/0x80 [ 1010.523601] [<ffffffff8122fb8e>] ? i915_gem_object_set_to_gtt_domain+0x2e/0xb0 [ 1010.523623] [<ffffffff8123113b>] ? i915_gem_set_domain_ioctl+0xdb/0x1f0 [ 1010.523644] [<ffffffff8120a3f1>] ? drm_ioctl+0x3d1/0x460 [ 1010.523660] [<ffffffff81231060>] ? i915_gem_set_domain_ioctl+0x0/0x1f0 [ 1010.523682] [<ffffffff81092618>] ? vma_prio_tree_insert+0x28/0x120 [ 1010.523701] [<ffffffff8109f379>] ? vma_link+0x99/0xf0 [ 1010.523717] [<ffffffff810a111d>] ? mmap_region+0x1ed/0x4f0 [ 1010.523734] [<ffffffff810c306f>] ? do_vfs_ioctl+0x9f/0x580 [ 1010.523750] [<ffffffff810c3599>] ? sys_ioctl+0x49/0x80 [ 1010.523767] [<ffffffff810022eb>] ? system_call_fastpath+0x16/0x1b [ 1010.523785] Code: 00 00 00 00 00 41 57 89 ce 41 56 41 55 41 54 45 89 c4 55 48 89 fd 53 48 89 d3 44 89 c2 48 89 df 4c 8d b3 c8 00 00 00 48 83 ec 18 <ff> 93 88 00 00 00 48 8b 83 c8 00 00 00 4c 8b bd 30 03 00 00 48 [ 1010.523946] RIP [<ffffffff8122d006>] i915_gem_flush_ring+0x26/0x140 [ 1010.523966] RSP <ffff88014bf97cd8> [ 1010.523977] CR2: 0000000000000088 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-27 23:26:34 +01:00
Chris Wilson	e1f99ce6ca	drm/i915: Propagate errors from writing to ringbuffer Preparing the ringbuffer for adding new commands can fail (a timeout whilst waiting for the GPU to catch up and free some space). So check for any potential error before overwriting HEAD with new commands, and propagate that error back to the user where possible. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-27 23:26:34 +01:00
Chris Wilson	78501eac34	drm/i915/ringbuffer: Drop the redundant dev from the vfunc interface The ringbuffer keeps a pointer to the parent device, so we can use that instead of passing around the pointer on the stack. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-27 12:18:21 +01:00
Linus Torvalds	c48c43e422	Merge branch 'drm-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (476 commits) vmwgfx: Implement a proper GMR eviction mechanism drm/radeon/kms: fix r6xx/7xx 1D tiling CS checker v2 drm/radeon/kms: properly compute group_size on 6xx/7xx drm/radeon/kms: fix 2D tile height alignment in the r600 CS checker drm/radeon/kms/evergreen: set the clear state to the blit state drm/radeon/kms: don't poll dac load detect. gpu: Add Intel GMA500(Poulsbo) Stub Driver drm/radeon/kms: MC vram map needs to be >= pci aperture size drm/radeon/kms: implement display watermark support for evergreen drm/radeon/kms/evergreen: add some additional safe regs v2 drm/radeon/r600: fix tiling issues in CS checker. drm/i915: Move gpu_write_list to per-ring drm/i915: Invalidate the to-ring, flush the old-ring when updating domains drm/i915/ringbuffer: Write the value passed in to the tail register agp/intel: Restore valid PTE bit for Sandybridge after `bdd3072` drm/i915: Fix flushing regression from `9af90d19f` drm/i915/sdvo: Remove unused encoding member i915: enable AVI infoframe for intel_hdmi.c [v4] drm/i915: Fix current fb blocking for page flip drm/i915: IS_IRONLAKE is synonymous with gen == 5 ... Fix up conflicts in - drivers/gpu/drm/i915/{i915_gem.c, i915/intel_overlay.c}: due to the new simplified stack-based kmap_atomic() interface - drivers/gpu/drm/vmwgfx/vmwgfx_drv.c: added .llseek entry due to BKL removal cleanups.	2010-10-26 18:57:59 -07:00
Peter Zijlstra	3e4d3af501	mm: stack based kmap_atomic() Keep the current interface but ignore the KM_type and use a stack based approach. The advantage is that we get rid of crappy code like: #define __KM_PTE \ (in_nmi() ? KM_NMI_PTE : \ in_irq() ? KM_IRQ_PTE : \ KM_PTE0) and in general can stop worrying about what context we're in and what kmap slots might be appropriate for that. The downside is that FRV kmap_atomic() gets more expensive. For now we use a CPP trick suggested by Andrew: #define kmap_atomic(page, args...) __kmap_atomic(page) to avoid having to touch all kmap_atomic() users in a single patch. [ not compiled on: - mn10300: the arch doesn't actually build with highmem to begin with ] [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: fix up drivers/gpu/drm/i915/intel_overlay.c] Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Chris Metcalf <cmetcalf@tilera.com> Cc: David Howells <dhowells@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Miller <davem@davemloft.net> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Dave Airlie <airlied@linux.ie> Cc: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-10-26 16:52:08 -07:00
Chris Wilson	641934069d	drm/i915: Move gpu_write_list to per-ring ... to prevent flush processing of an idle (or even absent) ring. This fixes a regression during suspend from `87acb0a5`. Reported-and-tested-by: Alexey Fisher <bug-track@fisher-privat.net> Tested-by: Peter Clifton <pcjc2@cam.ac.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-24 20:22:51 +01:00
Chris Wilson	b6651458d3	drm/i915: Invalidate the to-ring, flush the old-ring when updating domains When the object has been written to by the gpu it remains on the ring until its flush has been retired. However, when the object is moving to the ring and the associated cache needs to be invalidated, we need to perform the flush on the target ring, not the one it came from (which is NULL in the reported case and so the flush was entirely absent). Reported-by: Peter Clifton <pcjc2@cam.ac.uk> Reported-and-tested-by: Alexey Fisher <bug-track@fisher-privat.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-23 11:07:21 +01:00
Chris Wilson	878a3c37d3	drm/i915: Fix flushing regression from `9af90d19f` Whilst moving the code around in `9af90d19f`, I dropped the or'ing in of new write domains which would zero out the write domain for a render target if later reused as a source later in the batch. This meant that we might drop a required flush before reading from the render target. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=31043 Reported-by: xunx.fang@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-22 10:48:12 +01:00
Chris Wilson	549f736582	drm/i915: Enable SandyBridge blitter ring Based on an original patch by Zhenyu Wang, this initializes the BLT ring for SandyBridge and enables support for user execbuffers. Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-21 19:08:39 +01:00
Chris Wilson	b5dc608c98	drm/i915: Copy the updated reloc->presumed_offset back to the user If the userspace driver is using a constant relocation array with a static buffer, they will pass the same relocation array back to the kernel. So we do need to update the presumed offset value in those relocations to reflect the current object so that they remain correct with future batchbuffers and we avoid the necessity of having to suspend execution and perform redundant relocations. Fixes the regression introduced by `12f889c` for applications using absolute addressing on trees of buffer (i.e. the current consumers of libdrm_intel.so). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=30996 Reported-by: Wang, Jinjin <jinjin.wang@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-20 21:06:34 +01:00
Chris Wilson	69dc4987cb	drm/i915: Track objects in global active list (as well as per-ring) To handle retirements, we need per-ring tracking of active objects. To handle evictions, we need global tracking of active objects. As we enable more rings, rebuilding the global list from the individual per-ring lists quickly grows tiresome and overly complicated. Tracking the active objects in two lists is the lesser of two evils. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-20 10:51:51 +01:00
Chris Wilson	87acb0a550	drm/i915: Simplify most HAS_BSD() checks ... by always initialising the empty ringbuffer it is always then safe to check whether it is active. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-20 10:51:51 +01:00
Chris Wilson	9af90d19f8	drm/i915: cache the last object lookup during pin_and_relocate() The most frequent relocation within a batchbuffer is a contiguous sequence of vertex buffer relocations, for which we can virtually eliminate the drm_gem_object_lookup() overhead by caching the last handle to object translation. In doing so we refactor the pin and relocate retry loop out of do_execbuffer into its own helper function and so improve the error paths. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-20 10:51:50 +01:00
Chris Wilson	1d7cfea152	drm/i915: Do interrupible mutex lock first to avoid locking for unreference One of the primarily consumers of the i915 driver is X, a large signal driven application. Frequently when writing into the buffers, there is a pending signal which causes us not to take the interruptible lock but then we need to take that same lock around the object unreference. By rearranging the code to do the interruptible lock as the first check, we can avoid the frequent additional locking around the unreference. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-19 09:20:23 +01:00
Chris Wilson	4f27b75d56	drm/i915: rearrange mutex acquisition for pread ... to avoid the double acquisition along fast[er] paths. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-19 09:19:55 +01:00
Chris Wilson	fbd5a26d50	drm/i915: Rearrange acquisition of mutex during pwrite ... to avoid reacquiring it to drop the object reference count on exit. Note we have to make sure we now drop (and reacquire) the lock around acquiring the mm semaphore on the slow paths. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-19 09:19:47 +01:00
Chris Wilson	b5e4feb661	drm/i915: Attempt to prefault user pages for pread/pwrite ... in the hope that it makes the atomic fast paths more likely. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-19 09:19:37 +01:00
Chris Wilson	202f2fef7a	drm/i915: Avoid taking the mutex for dropping the refcnt upon creation After allocation a handle for the fresh object, we know that we can safely drop the refcnt without triggering a free so we do not need the mutex. Strangely, this mutex acquisition is the one that appears on driver profiles. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-19 09:19:28 +01:00
Chris Wilson	f0c43d9b7e	drm/i915: Perform relocations in CPU domain [if in CPU domain] Avoid an early eviction of the batch buffer into the uncached GTT domain, and so do the relocation fixup in cacheable memory. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-19 09:19:18 +01:00
Chris Wilson	2549d6c26c	drm/i915: Avoid vmallocing a buffer for the relocations ... perform an access validation check up front instead and copy them in on-demand, during i915_gem_object_pin_and_relocate(). As around 20% of the CPU overhead may be spent inside vmalloc for the relocation entries when submitting an execbuffer [for x11perf -aa10text], the savings are considerable and result in around a 10% throughput increase [for glyphs]. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-19 09:18:36 +01:00
Chris Wilson	e59f2bac15	drm/i915: Wait for pending flips on the GPU Currently, if a batch buffer refers to an object with a pending flip, then we sleep until that pending flip is completed (unpinned and signalled). This is so that a flip can be queued and the user can continue rendering to the backbuffer oblivious to whether the buffer is still pinned as the scan out. (The kernel arbitrating at the last moment to stall the batch and wait until the buffer is unpinned and replaced as the front buffer.) As we only have a queue depth of 1, we can simply wait for the current pending flip to complete and continue rendering. We can achieve this with a single WAIT_FOR_EVENT command inserted into the ring buffer prior to executing the batch, without stalling the client. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-07 19:10:09 +01:00
Dave Airlie	fb7ba2114b	Merge remote branch 'korg/drm-fixes' into drm-vmware-next necessary for some of the vmware fixes to be pushed in. Conflicts: drivers/gpu/drm/drm_gem.c drivers/gpu/drm/i915/intel_fb.c include/drm/drmP.h	2010-10-06 11:10:48 +10:00
Linus Torvalds	c470af0a27	Merge branch 'drm-intel-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ickle/drm-intel * 'drm-intel-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ickle/drm-intel: drm/i915: Rephrase pwrite bounds checking to avoid any potential overflow drm/i915: Sanity check pread/pwrite drm/i915: Use pipe state to tell when pipe is off drm/i915: vblank status not valid while training display port drivers/gpu/drm/i915/i915_gem.c: Add missing error handling code drm/i915: Fix refleak during eviction. drm/i915: fix GMCH power reporting	2010-10-04 11:10:26 -07:00
Chris Wilson	35b62a89b0	drm/i915: Skip pread/pwrite if size to copy is 0. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-04 10:07:46 +01:00
Chris Wilson	df6d075a4d	Merge branch 'drm-intel-fixes' into drm-intel-next	2010-10-04 10:07:38 +01:00
Chris Wilson	7dcd2499de	drm/i915: Rephrase pwrite bounds checking to avoid any potential overflow ... and do the same for pread. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org	2010-10-03 14:16:18 +01:00
Chris Wilson	ce9d419dbe	drm/i915: Sanity check pread/pwrite Move the access control up from the fast paths, which are no longer universally taken first, up into the caller. This then duplicates some sanity checking along the slow paths, but is much simpler. Tracked as CVE-2010-2962. Reported-by: Kees Cook <kees@ubuntu.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org	2010-10-03 14:16:17 +01:00
Chris Wilson	58e10eb92d	Merge branch 'drm-intel-fixes' into drm-intel-next Conflicts: drivers/gpu/drm/i915/i915_gem_evict.c drivers/gpu/drm/i915/intel_display.c drivers/gpu/drm/i915/intel_dp.c	2010-10-03 10:56:11 +01:00
Julia Lawall	929f49bf22	drivers/gpu/drm/i915/i915_gem.c: Add missing error handling code Extend the error handling code with operations found in other nearby error handling code A simplified version of the sematic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @r exists@ @r@ statement S1,S2,S3; constant C1,C2,C3; @@ if (...) {... S1 return -C1;} ... if (...) {... when != S1 return -C2;} ... *if (...) {... S1 return -C3;} // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org	2010-10-02 15:21:26 +01:00
Chris Wilson	1cdf7fef79	drm/i915: Don't mask the return code whilst relocating. The return from move_to_gtt_domain() may indicate a pending signal which needs to handled as opposed to an actual error, for instance, so report the original return value rather than forcing an EINVAL. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-02 15:12:41 +01:00
Linus Torvalds	18ffe4b18c	Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: vmwgfx: Fix fb VRAM pinning failure due to fragmentation vmwgfx: Remove initialisation of dev::devname vmwgfx: Enable use of the vblank system vmwgfx: vt-switch (master drop) fixes drm/vmwgfx: Fix breakage introduced by commit "drm: block userspace under allocating buffer and having drivers overwrite it (v2)" drm: Hold the mutex when dropping the last GEM reference (v2) drm/gem: handlecount isn't really a kref so don't make it one. drm: i810/i830: fix locked ioctl variant drm/radeon/kms: add quirk for MSI K9A2GM motherboard drm/radeon/kms: fix potential segfault in r600_ioctl_wait_idle drm: Prune GEM vma entries drm/radeon/kms: fix up encoder info messages for DFP6 drm/radeon: fix PCI ID 5657 to be an RV410	2010-10-01 10:58:31 -07:00
Chris Wilson	069efc1dac	drm/i915: Clear fence registers on GPU reset When the GPU is reset, the fence registers are invalidated, so release the objects and clear them out. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-01 14:45:22 +01:00
Chris Wilson	812ed49243	drm/i915: Force the domain to CPU on unbinding whilst wedged. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=30083 Reported-by: Sitsofe Wheeler <sitsofe@yahoo.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-01 14:45:21 +01:00
Chris Wilson	73aa808f10	drm: Move the GTT accounting to i915 Only drm/i915 does the bookkeeping that makes the information useful, and the information maintained is driver specific, so move it out of the core and into its single user. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Dave Airlie <airlied@redhat.com>	2010-10-01 14:45:20 +01:00
Dave Airlie	29d08b3efd	drm/gem: handlecount isn't really a kref so don't make it one. There were lots of places being inconsistent since handle count looked like a kref but it really wasn't. Fix this my just making handle count an atomic on the object, and have it increase the normal object kref. Now i915/radeon/nouveau drivers can drop the normal reference on userspace object creation, and have the handle hold it. This patch fixes a memory leak or corruption on unload, because the driver had no way of knowing if a handle had been actually added for this object, and the fbcon object needed to know this to clean itself up properly. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Dave Airlie <airlied@redhat.com>	2010-10-01 09:17:44 +10:00
Chris Wilson	f394940b8d	drm/i915: Remove redundant deletion of obj->gpu_write_list At that point as the object is no longer in any GPU write domain it must not be on the list, so the list_del() is redundant. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-30 09:30:51 +01:00
Chris Wilson	5cdf588174	drm/i915: Make get/put pages static Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-30 09:30:13 +01:00
Chris Wilson	23bc598253	drm/i915/debug: Convert i915_verify_active() to scan all lists ... and check more regularly. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-30 09:30:11 +01:00
Chris Wilson	891b48cfc8	drm/i915: Avoid blocking the kworker thread on a stuck mutex Just reschedule the retire requests again if the device is currently busy. The request list will be pruned along other paths so will never grow unbounded and so we can afford to miss the occasional pruning. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-29 12:26:37 +01:00
Chris Wilson	3d2a812ae4	drm/i915/debug: Remove default WATCH_BUF Replaced by tracepoints. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-29 11:41:19 +01:00
Chris Wilson	97d1ebaf81	drm/i915/debug: Remove defunct WATCH_LRU This has bitrotted through inuse and superseded by tracing and debugfs. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-29 11:41:18 +01:00
Chris Wilson	e0e41598b4	Merge branch 'drm-intel-fixes' into drm-intel-next	2010-09-28 15:48:38 +01:00
Chris Wilson	a56ba56c27	Revert "drm/i915: Drop ring->lazy_request" With multiple rings generating requests independently, the outstanding requests must also be track independently. Reported-by: Wang Jinjin <jinjin.wang@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=30380 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-28 11:30:52 +01:00
Chris Wilson	ced270fa89	drm/i915: Ensure that the mode change flushing is currently uninterruptible Introduced by `48b956c5`, I had thought I had already fixed this. Oh well. Reported-by: Sitsofe Wheeler <sitsofe@yahoo.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-26 22:50:36 +01:00
Chris Wilson	1c25595f8d	drm/i915: Convert the file mutex into a spinlock Daniel Vetter pointed out that in this case is would be clearer and cleaner to use a spinlock instead of a mutex to protect the per-file request list manipulation. Make it so. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-26 11:03:27 +01:00
Chris Wilson	76c1dec197	drm/i915: Make the mutex_lock interruptible on ioctl paths ... and combine it with the wedged completion handler. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-25 12:23:12 +01:00
Chris Wilson	30dbf0c07f	drm/i915: Adjust hangcheck EIO semantics Owain Ainsworth reported an issue between the interaction of the hangcheck and userspace immediately (and permanently) falling back to s/w rasterisation. In order to break the mutex and begin resetting the GPU, we must abort the current operation (usually within the wait) and climb sufficiently far back up the call chain to drop the mutex. In his implementation, Owain has a loop within the ioctl handler to detect the hang and then sleep until the error handler has run. I've chosen to return to userspace and report an EAGAIN which should trigger the userspace ioctl handler to repeat the call (simply because it felt less invasive...). Before hitting a wedged GPU, we then wait upon completion of the error handler. Reported-by: Owain G. Ainsworth <zerooa@googlemail.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-25 12:23:12 +01:00
Chris Wilson	f787a5f59e	drm/i915: Only hold a process-local lock whilst throttling. Avoid cause latencies in other clients by not taking the global struct mutex and moving the per-client request manipulation a local per-client mutex. For example, this allows a compositor to schedule a page-flip (through X) whilst an OpenGL application is monopolising the GPU. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-24 21:03:00 +01:00
Chris Wilson	e6c3a2a6d3	drm/i915: Use an uninterruptible wait for page-flips during modeset We need to drain the pending flips prior to disabling the pipe during modeset, and these need to be done in an uninterruptible fashion. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-24 14:19:57 +01:00

1 2 3 4 5 ...

443 Commits