linux

Author	SHA1	Message	Date
Ben Widawsky	2911a35b2e	drm/i915: use semaphores for the display plane In theory this will have performance and power improvements. Performance because we don't need to stall when the scanout BO is busy, and power because we don't have to stall when the BO is busy (and the ring can even go to sleep if the HW supports it). v2: squash 2 patches into 1 (me) un-inline the enable_semaphores function (Daniel) remove comment about SNB hangs from i915_gem_object_sync (Chris) rename intel_enable_semaphores to i915_semaphore_is_enabled (me) removed page flip comment; "no why" (Chris) To address other comments from Daniel (irc): update the comment to say 'vt-d is crap, don't enable semaphores' - I think you misinterpreted Chris' comment, it already exists. checking out whether we can pageflip on the render ring on ivb (didn't work on early silicon) - We don't want to enable workarounds for early silicon unless we have to. - I can't find any references in the docs about this. optionally use it if the fb is already busy on the render ring - This should be how the code already worked, unless I am misunderstanding your meaning. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-12 21:14:05 +02:00
Chris Wilson	9a5a53b392	drm/i915: Reorganise rules for get_fence/put_fence By simplifying the rules to calling get_fence when writing to the object through the GTT in a tiled manner, and calling put_fence before writing to the object through the GTT in a linear manner, the code becomes clearer and there is less chance of making a mistake. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> [danvet: fixed up conflict with ppgtt code and spelling in a new comment.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-12 21:14:04 +02:00
Dave Airlie	effbc4fd8e	Merge branch 'drm-intel-next' of git://people.freedesktop.org/~danvet/drm-intel into drm-core-next Daniel Vetter wrote First pull request for 3.5-next, slightly large than usual because new things kept coming in since the last pull for 3.4. Highlights: - first batch of hw enablement for vlv (Jesse et al) and hsw (Eugeni). pci ids are not yet added, and there's still quite a few patches to merge (mostly modesetting). To make QA easier I've decided to merge this stuff in pieces. - loads of cleanups and prep patches spurred by the above. Especially vlv is a real frankenstein chip, but also hsw is stretching our driver's code design. Expect more to come in this area for 3.5. - more gmbus fixes, cleanups and improvements by Daniel Kurtz. Again, there are more patches needed (and some already queued up), but I wanted to split this a bit for better testing. - pwrite/pread rework and retuning. This series has been in the works for a few months already and a lot of i-g-t tests have been created for it. Now it's finally ready to be merged. Note that one patch in this series touches include/pagemap.h, that patch is acked-by akpm. - reduce mappable pressure and relocation throughput improvements from Chris. - mmap offset exhaustion mitigation by Chris Wilson. - a start at figuring out which codepaths in our messy dri1/ums+gem/kms driver we actually need to support by bailing out of unsupported case. The driver now refuses to load without kms on gen6+ and disallows a few ioctls that userspace never used in certain cases. More of this will definitely come. - More decoupling of global gtt and ppgtt. - Improved dual-link lvds detection by Takashi Iwai. - Shut up the compiler + plus fix the fallout (Ben) - Inverted panel brightness handling (mostly Acer manages to break things in this way). - Small fixlets and adjustements and some minor things to help debugging. 
Regression-wise QA reported quite a few issues on ivb, but all of them turned out to be hw stability issues which are already fixed in drm-intel-fixes (QA runs the nightly regression tests on -next alone, without -fixes automatically merged in). There's still one issue open on snb, it looks like occlusion query writes are not quite as cache coherent as we've expected. With some of the pwrite adjustements we can now reliably hit this. Kernel workaround for it is in the works." * 'drm-intel-next' of git://people.freedesktop.org/~danvet/drm-intel: (101 commits) drm/i915: VCS is not the last ring drm/i915: Add a dual link lvds quirk for MacBook Pro 8,2 drm/i915: make quirks more verbose drm/i915: dump the DMA fetch addr register on pre-gen6 drm/i915/sdvo: Include YRPB as an additional TV output type drm/i915: disallow gem init ioctl on ilk drm/i915: refuse to load on gen6+ without kms drm/i915: extract gt interrupt handler drm/i915: use render gen to switch ring irq functions drm/i915: rip out old HWSTAM missed irq WA for vlv drm/i915: open code gen6+ ring irqs drm/i915: ring irq cleanups drm/i915: add SFUSE_STRAP registers for digital port detection drm/i915: add WM_LINETIME registers drm/i915: add WRPLL clocks drm/i915: add LCPLL control registers drm/i915: add SSC offsets for SBI access drm/i915: add port clock selection support for HSW drm/i915: add S PLL control drm/i915: add PIXCLK_GATE register ... Conflicts: drivers/char/agp/intel-agp.h drivers/char/agp/intel-gtt.c drivers/gpu/drm/i915/i915_debugfs.c	2012-04-12 10:27:01 +01:00
Daniel Vetter	15a13bbdff	drm/i915: clear fencing tracking state when retiring requests This fixes a resume regression introduced in commit `7dd4906586` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Wed Mar 21 10:48:18 2012 +0000 drm/i915: Mark untiled BLT commands as fenced on gen2/3 which fixed fencing tracking for untiled blt commands. A side effect of that patch was that now also untiled objects have a non-zero obj->last_fenced_seqno to track when a fence can be set up after a pipelined tiling change. Unfortunately this was only cleared by the fence setup and teardown code, resulting in tons of untiled but inactive objects with non-zero last_fenced_seqno. Now after resume we completely reset the seqno tracking, both on the driver side (by setting dev_priv->next_seqno = 1) and on the hw side (by allocating a new hws page, which contains the seqnos). Hilarity and indefinite waits ensued from the stale seqnos in obj->last_fenced_seqno from before the suspend. The fix is to properly clear the fencing tracking state like we already do for the normal gpu rendering while moving objects off the active list. Reported-and-tested-by: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Jiri Slaby <jslaby@suse.cz> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-12 09:02:37 +02:00
Daniel Vetter	f534bc0b22	drm/i915: disallow gem init ioctl on ilk Ums is already disabled, but on ilk we can additionally disable gem initialization when using user mode setting. Upstream never supported ilk without kernel modesetting and not even the RHEL ilk ums backport needs gem - that driver is based on xf86-video-intel version 2.2, which is pre-gem. Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-09 18:04:08 +02:00
Chris Wilson	7dd4906586	drm/i915: Mark untiled BLT commands as fenced on gen2/3 The BLT commands on gen2/3 utilize the fence registers and so we cannot modify any fences for the object whilst those commands are in flight. Currently we marked tiled commands as occupying a fence, but forgot to restrict the untiled commands from preventing a fence being assigned before they were completed. One side-effect is that we then have to double check that a fence was allocated for a fenced buffer during move-to-active. Reported-by: Jiri Slaby <jirislaby@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=43427 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47990 Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Testcase: i-g-t/tests/gem_tiled_after_untiled_blt Tested-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-01 12:26:05 +02:00
Daniel Vetter	55a254ac63	drm/i915: properly restore the ppgtt page directory on resume The ppgtt page directory lives in a snatched part of the gtt pte range. Which naturally gets cleared on hibernate when we pull the power. Suspend to ram (which is what I've tested) works because despite the fact that this is a mmio region, it is actually backed by system ram. Fix this by moving the page directory setup code to the ppgtt init code (which gets called on resume). This fixes hibernate on my ivb and snb. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-04-01 12:25:29 +02:00
Jesse Barnes	23e3f9b37e	drm/i915: check for disabled interrupts on ValleyView Haven't seen this yet, but it doesn't hurt. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-29 00:11:46 +02:00
Daniel Vetter	e7e58eb5c0	drm/i915: mark pwrite/pread slowpaths with unlikely Beside helping the compiler untangle this maze they double-up as documentation for which parts of the code aren't performance-critical but just around to keep old (but already dead-slow) userspace from breaking. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-27 13:41:41 +02:00
Daniel Vetter	23c18c71da	drm/i915: fixup in-line clflushing on bit17 swizzled bos The issue is that with inline clflushing the clflushing isn't properly swizzled. Fix this by - always clflushing entire 128 byte chunks and - unconditionally flush before writes when swizzling a given page. We could be clever and check whether we pwrite a partial 128 byte chunk instead of a partial cacheline, but I've figured that's not worth it. Now the usual approach is to fold this into the original patch series, but I've opted against this because - this fixes a corner case only very old userspace relies on and - I'd like to not invalidate all the testing the pwrite rewrite has gotten. This fixes the regression notice by tests/gem_tiled_partial_prite_pread from i-g-t. Unfortunately it doesn't fix the issues with partial pwrites to tiled buffers on bit17 swizzling machines. But that is also broken without the pwrite patches, so likely a different issue (or a problem with the testcase). v2: Simplify the patch by dropping the overly clever partial write logic for swizzled pages. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-27 13:40:57 +02:00
Daniel Vetter	f56f821feb	mm: extend prefault helpers to fault in more than PAGE_SIZE drm/i915 wants to read/write more than one page in its fastpath and hence needs to prefault more than PAGE_SIZE bytes. Add new functions in filemap.h to make that possible. Also kill a copy&pasted spurious space in both functions while at it. v2: As suggested by Andrew Morton, add a multipage parameter to both functions to avoid the additional branch for the pagemap.c hotpath. My gcc 4.6 here seems to dtrt and indeed reap these branches where not needed. v3: Becaus I couldn't find a way around adding a uaddr += PAGE_SIZE to the filemap.c hotpaths (that the compiler couldn't remove again), let's go with separate new functions for the multipage use-case. v4: Adjust comment to CodingStlye and fix spelling. Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-27 13:36:30 +02:00
Daniel Vetter	d174bd6472	drm/i915: extract copy helpers from shmem_pread|pwrite While moving around things, these two functions slowly grew out of any sane bounds. So extract a few lines that do the copying and clflushing. Also add a few comments to explain what's going on. v2: Again do s/needs_clflush/needs_clflush_after/ in the write paths as suggested by Chris Wilson. Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-27 13:30:33 +02:00
Daniel Vetter	117babcdd5	drm/i915: use uncached writes in pwrite It's around 20% faster. Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-27 13:29:38 +02:00
Daniel Vetter	ffc62976d2	drm/i915: fall back to shmem pwrite when the buffer is not accessible It's too expensive to move it around just for that pwrite, especially when we're thrashing on the mappable gtt part like crazy. Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-27 13:29:08 +02:00
Daniel Vetter	586428852a	drm/i915: implement inline clflush for pwrite In micro-benchmarking of the usual pwrite use-pattern of alternating pwrites with gtt domain reads from the gpu, this yields around 30% improvement of pwrite throughput across all buffers size. The trick is that we can avoid clflush cachelines that we will overwrite completely anyway. Furthermore for partial pwrites it gives a proportional speedup on top of the 30% percent because we only clflush back the part of the buffer we're actually writing. v2: Simplify the clflush-before-write logic, as suggested by Chris Wilson. v3: Finishing touches suggested by Chris Wilson: - add comment to needs_clflush_before and only set this if the bo is uncached. - s/needs_clflush/needs_clflush_after/ in the write paths to clearly differentiate it from needs_clflush_before. Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-27 13:28:45 +02:00
Daniel Vetter	96d79b5270	drm/i915: don't clobber userspace memory before commiting to the pread The pagemap.h prefault helpers do the prefaulting by simply writing some data into every page. Hence we should not prefault when we're not yet commited to to actually writing data to userspace. The problem is now that - we can't prefault while holding dev->struct_mutex for we could deadlock with our own pagefault handler - we need to grab dev->struct_mutex before copying to sync up with any outsanding gpu writes. Therefore only prefault when we're dropping the lock the first time in the pread slowpath - at that point we're committed to the write, don't wait on the gpu anymore and hence won't return early (with e.g. -EINTR). Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-27 13:28:32 +02:00
Daniel Vetter	935aaa692e	drm/i915: drop gtt slowpath With the proper prefault, it's extremely unlikely that we fall back to the gtt slowpath. So just kill it and use the shmem_pwrite path as fallback. To further clean up the code, move the preparatory gem calls into the respective pwrite functions. This way the gtt_fast->shmem fallback is much more obvious. Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-27 13:27:21 +02:00
Daniel Vetter	692a576b9d	drm/i915: don't call shmem_read_mapping unnecessarily This speeds up pwrite and pread from ~120 µs to ~100 µs for reading/writing 1mb on my snb (if the backing storage pages are already pinned, of course). v2: Chris Wilson pointed out a glaring page reference bug - I've unconditionally dropped the reference. With that fixed (and the associated reduction of dirt in dmesg) it's now even a notch faster. v3: Unconditionally grab a page reference when dropping dev->struct_mutex to simplify the code-flow. Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-27 13:27:03 +02:00
Daniel Vetter	3ae5378330	drm/i915: don't use gtt_pwrite on LLC cached objects ~120 µs instead of ~210 µs to write 1mb on my snb. I like this. Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-27 13:25:45 +02:00
Daniel Vetter	a0356fc373	drm/i915: kill ranged cpu read domain support No longer needed. Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-27 13:25:32 +02:00
Daniel Vetter	8489731c9b	drm/i915: move clflushing into shmem_pread This is obviously gonna slow down pread. But for a half-way realistic micro-benchmark, it doesn't matter: Non-broken userspace reads back data from the gpu once before the gpu again dirties it. So all this ranged clflush tracking is just a waste of time. No pread performance change (neglecting the dumb benchmark of constantly reading the same data) measured. As an added bonus, this avoids clflush on read on coherent objects. Which means that partial preads on snb are now roughly 4x as fast. This will be usefull for e.g. the libva encoder - when I finally get around to fix that up. v2: Properly sync with the gpu on LLC machines. Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-27 13:20:01 +02:00
Daniel Vetter	dbf7bff074	drm/i915: merge shmem_pread slow&fast-path With the previous rewrite, they've become essentially identical. v2: Simplify the page_do_bit17_swizzling logic as suggested by Chris Wilson. Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-27 13:19:11 +02:00
Daniel Vetter	e244a443bf	drm/i915: merge shmem_pwrite slow&fast-path With the previous rewrite, they've become essentially identical. v2: Simplify the page_do_bit17_swizzling logic as suggested by Chris Wilson. Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-27 13:18:58 +02:00
Chris Wilson	dabdfe021a	drm/i915: Avoid using mappable space for relocation processing through the CPU We try to avoid writing the relocations through the uncached GTT, if the buffer is currently in the CPU write domain and so will be flushed out to main memory afterwards anyway. Also on SandyBridge we can safely write to the pages in cacheable memory, so long as the buffer is LLC mapped. In either of these cases, we therefore do not need to force the reallocation of the buffer into the mappable region of the GTT, reducing the aperture pressure. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-27 13:16:17 +02:00
Daniel Vetter	644ec02b5d	drm/i915: s/i915_gem_do_init/i915_gem_init_global_gtt ... because this is what it actually does now that we have the global gtt vs. ppgtt split. Also move it to the other global gtt functions in i915_gem_gtt.c Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-27 13:14:59 +02:00
Chris Wilson	a14917eeb2	drm/i915: Release the mmap offset when purging a buffer If we discard a buffer due to memory pressure, also release its allotted mmap address space. As it may be some time before userspace wakes up and notices that it has buffers to purge from its cache, we may waste valuable address space on unusable objects for a period of time. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47738 Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-23 11:04:35 +01:00
Ben Widawsky	eb2c0c818a	drm/i915: [dinq] shut up two instances -Wunitialized Introduced in commit `8461d226` and `8c59967c` Signed-off-by: Ben Widawsky <ben@bwidawsk.net> [danvet: s/fix/shut up/ in the commit msg.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-22 17:44:09 +01:00
Daniel Vetter	0ebb982993	drm/i915: enable lazy global-gtt binding Now that everything is in place, only bind to the global gtt when actually required. Patch split-up suggested by Chris Wilson. Reviewed-and-tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-20 21:55:16 +01:00
Daniel Vetter	74898d7edc	drm/i915: bind objects to the global gtt only when needed And track the existence of such a binding similar to the aliasing ppgtt case. Speeds up binding/unbinding in the common case where we only need a ppgtt binding (which is accessed in a cpu coherent fashion by the gpu) and no gloabl gtt binding (which needs uc writes for the ptes). This patch just puts the required tracking in place. v2: Check that global gtt mappings exist in the error_state capture code (with Chris Wilson's llc reloc patches batchbuffers are no longer relocated as mappable in all situations, so this matters). Suggested by Chris Wilson. v3: Adapted to Chris' latest llc-reloc patches. v4: Fix a bug in the i915 error state capture code noticed by Chris Wilson. Reviewed-and-tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-20 21:52:01 +01:00
Daniel Vetter	741639079c	drm/i915: split out dma mapping from global gtt bind/unbind functions Note that there's a functional change buried in this patch wrt the ilk dmar workaround: We now only idle the gpu while tearing down the dmar mappings, not while clearing the gtt. Keeping the current semantics would have made for some really ugly code and afaik the issue is only with the dmar unmapping that needs a fully idle gpu. Reviewed-and-tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-20 21:51:41 +01:00
Chris Wilson	c501ae7f33	drm/i915: Only clear the GPU domains upon a successful finish By clearing the GPU read domains before waiting upon the buffer, we run the risk of the wait being interrupted and the domains prematurely cleared. The next time we attempt to wait upon the buffer (after userspace handles the signal), we believe that the buffer is idle and so skip the wait. There are a number of bugs across all generations which show signs of an overly haste reuse of active buffers. Such as: https://bugs.freedesktop.org/show_bug.cgi?id=29046 https://bugs.freedesktop.org/show_bug.cgi?id=35863 https://bugs.freedesktop.org/show_bug.cgi?id=38952 https://bugs.freedesktop.org/show_bug.cgi?id=40282 https://bugs.freedesktop.org/show_bug.cgi?id=41098 https://bugs.freedesktop.org/show_bug.cgi?id=41102 https://bugs.freedesktop.org/show_bug.cgi?id=41284 https://bugs.freedesktop.org/show_bug.cgi?id=42141 A couple of those pre-date i915_gem_object_finish_gpu(), so may be unrelated (such as a wild write from a userspace command buffer), but this does look like a convincing cause for most of those bugs. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-01 21:36:13 +01:00
Chris Wilson	eadb29a9c5	drm/i915: Silence the error message from i915_wait_request() This error message has since been superseded by the hangcheck, and does not add any salient information beyond that already printed by hangcheck discovering the GPU hang that led to i915_wait_request() bombing out in the first place. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-02-27 18:08:22 +01:00
Chris Wilson	a71d8d9452	drm/i915: Record the tail at each request and use it to estimate the head By recording the location of every request in the ringbuffer, we know that in order to retire the request the GPU must have finished reading it and so the GPU head is now beyond the tail of the request. We can therefore provide a conservative estimate of where the GPU is reading from in order to avoid having to read back the ring buffer registers when polling for space upon starting a new write into the ringbuffer. A secondary effect is that this allows us to convert intel_ring_buffer_wait() to use i915_wait_request() and so consolidate upon the single function to handle the complicated task of waiting upon the GPU. A necessary precaution is that we need to make that wait uninterruptible to match the existing conditions as all the callers of intel_ring_begin() have not been audited to handle ERESTARTSYS correctly. By using a conservative estimate for the head, and always processing all outstanding requests first, we prevent a race condition between using the estimate and direct reads of I915_RING_HEAD which could result in the value of the head going backwards, and the tail overflowing once again. We are also careful to mark any request that we skip over in order to free space in ring as consumed which provides a self-consistency check. Given sufficient abuse, such as a set of unthrottled GPU bound cairo-traces, avoiding the use of I915_RING_HEAD gives a 10-20% boost on Sandy Bridge (i5-2520m): firefox-paintball 18927ms -> 15646ms: 1.21x speedup firefox-fishtank 12563ms -> 11278ms: 1.11x speedup which is a mild consolation for the performance those traces achieved from exploiting the buggy autoreported head. v2: Add a few more comments and make request->tail a conservative estimate as suggested by Daniel Vetter. 
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: resolve conflicts with retirement defering and the lack of the autoreport head removal (that will go in through -fixes).] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-02-15 14:26:03 +01:00
Daniel Vetter	53d227f282	drm/i915: fixup seqno allocation logic for lazy_request Currently we reserve seqnos only when we emit the request to the ring (by bumping dev_priv->next_seqno), but start using it much earlier for ring->oustanding_lazy_request. When 2 threads compete for the gpu and run on two different rings (e.g. ddx on blitter vs. compositor) hilarity ensued, especially when we get constantly interrupted while reserving buffers. Breakage seems to have been introduced in commit `6f392d5486` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Sat Aug 7 11:01:22 2010 +0100 drm/i915: Use a common seqno for all rings. This patch fixes up the seqno reservation logic by moving it into i915_gem_next_request_seqno. The ring->add_request functions now superflously still return the new seqno through a pointer, that will be refactored in the next patch. Note that with this change we now unconditionally allocate a seqno, even when ->add_request might fail because the rings are full and the gpu died. But this does not open up a new can of worms because we can already leave behind an outstanding_request_seqno if e.g. the caller gets interrupted with a signal while stalling for the gpu in the eviciton paths. And with the bugfix we only ever have one seqno allocated per ring (and only that ring), so there are no ordering issues with multiple outstanding seqnos on the same ring. v2: Keep i915_gem_get_seqno (but move it to i915_gem.c) to make it clear that we only have one seqno counter for all rings. Suggested by Chris Wilson. v3: As suggested by Chris Wilson use i915_gem_next_request_seqno instead of ring->oustanding_lazy_request to make the follow-up refactoring more clearly correct. Also improve the commit message with issues discussed on irc. 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45181 Tested-by: Nicolas Kalkhof nkalkhof()at()web.de Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-02-13 10:55:57 +01:00
Daniel Vetter	5391d0cffe	drm/i915: outstanding_lazy_request is a u32 So don't assign it false, that's just confusing ... No functional change here. Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-02-13 10:55:48 +01:00
Daniel Vetter	e21af88d39	drm/i915: enable ppgtt We want to unconditionally enable ppgtt for two reasons: - Windows uses this on snb and later. - We need the basic hw support to work before we can think about real per-process address spaces and other cool features we want. But Chris Wilson was complaining all over irc and intel-gfx that this will blow up if we don't have a module option to disable it. Hence add one, to prevent this. ppgtt support seems to slightly change the timings and make crashy things slightly more or less crashy. Now in my testing and the testing this got on troublesome snb machines, it seems to have improved things only. But on ivb it makes quite a few crashes happen much more often, see https://bugs.freedesktop.org/show_bug.cgi?id=41353 Luckily Eugeni Dodonov seems to have a set of workarounds that fix this issue. v2: Don't try to enable ppgtt on pre-snb. v3: Pimp commit message and make Chris Wilson less grumpy by adding a module option. v4: New try at making Chris Wilson happy. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-02-09 21:49:30 +01:00
Daniel Vetter	7bddb01fb9	drm/i915: ppgtt binding/unbinding support This adds support to bind/unbind objects and wires it up. Objects are only put into the ppgtt when necessary, i.e. at execbuf time. Objects are still unconditionally put into the global gtt. v2: Kill the quick hack and explicitly pass cache_level to ppgtt_bind like for the global gtt function. Noticed by Chris Wilson. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-02-09 21:25:23 +01:00
Daniel Vetter	11782b0233	drm/i915: consolidate swizzling control bit frobbing On gen5 we also need to correctly set up swizzling in the display scanout engine, but only there. Consolidate this into the same function. This has a small effect on ums setups - the kernel now also sets this bit in addition to userspace setting it. Given that this code only runs when userspace either can't (resume, gpu reset) or explicitly won't(gem_init) touch the hw this shouldn't have an adverse effect. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-02-08 23:18:27 +01:00
Daniel Vetter	f691e2f4ce	drm/i915: swizzling support for snb/ivb We have to do this manually. Somebody had a Great Idea. I've measured speed-ups just a few percent above the noise level (below 5% for the best case), but no slowdows. Chris Wilson measured quite a bit more (10-20% above the usual snb variance) on a more recent and better tuned version of sna, but also recorded a few slow-downs on benchmarks know for uglier amounts of snb-induced variance. v2: Incorporate Ben Widawsky's preliminary review comments and elaborate a bit about the performance impact in the changelog. v3: Add a comment as to why we don't need to check the 3rd memory channel. v4: Fixup whitespace. Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-02-08 23:16:24 +01:00
Daniel Vetter	8461d22677	drm/i915: rewrite shmem_pread_slow to use copy_to_user Like for shmem_pwrite_slow. The only difference is that because we read data, we can leave the fetched cachelines in the cpu: In the case that the object isn't in the cpu read domain anymore, the clflush for the next cpu read domain invalidation will simply drop these cachelines. slow_shmem_bit17_copy is now ununsed, so kill it. With this patch tests/gem_mmap_gtt now actually works. v2: add __ to copy_to_user_swizzled as suggested by Chris Wilson. v3: Fixup the swizzling logic, it swizzled the wrong pages. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38115 Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-01-30 23:34:34 +01:00
Daniel Vetter	8c59967c44	drm/i915: rewrite shmem_pwrite_slow to use copy_from_user ... instead of get_user_pages, because that fails on non page-backed user addresses like e.g. a gtt mapping of a bo. To get there essentially copy the vfs read path into pagecache. We can't call that right away because we have to take care of bit17 swizzling. To not deadlock with our own pagefault handler we need to completely drop struct_mutex, reducing the atomicity guarantees of our userspace abi. Implications for racing with other gem ioctl: - execbuf, pwrite, pread: Due to -EFAULT fallback to slow paths there's already the risk of the pwrite call not being atomic, no degradation. - read/write access to mmaps: already fully racy, no degradation. - set_tiling: Calling set_tiling while reading/writing is already pretty much undefined, now it just got a bit worse. set_tiling is only called by libdrm on unused/new bos, so no problem. - set_domain: When changing to the gtt domain while copying (without any read/write access, e.g. for synchronization), we might leave unflushed data in the cpu caches. The clflush_object at the end of pwrite_slow takes care of this problem. - truncating of purgeable objects: the shmem_read_mapping_page call could reinstate backing storage for truncated objects. The check at the end of pwrite_slow takes care of this. v2: - add missing intel_gtt_chipset_flush - add __ to copy_from_user_swizzled as suggested by Chris Wilson. v3: Fixup bit17 swizzling, it swizzled the wrong pages. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-01-30 23:34:21 +01:00
Daniel Vetter	5c0480f21f	drm/i915: fall through pwrite_gtt_slow to the shmem slow path The gtt_pwrite slowpath grabs the userspace memory with get_user_pages. This will not work for non-page backed memory, like a gtt mmapped gem object. Hence fall through to the shmem paths if we hit -EFAULT in the gtt paths. Now the shmem paths have exactly the same problem, but this way we only need to rearrange the code in one write path. v2: v1 accidentally falls back to shmem pwrite for phys objects. Fixed. v3: Make the codeflow around phys_pwrite clearer as suggested by Chris Wilson. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-01-30 23:34:07 +01:00
Chris Wilson	068c6ff1cb	drm/i915: Remove the upper limit on the bo size for mapping into the CPU domain The original intention of comparing the bo against the mappable GTT limits was to prevent a subsequent faulting of the bo into the GTT from clearing the entire GTT in vain. However, that was clearly a cut'n'paste mistake as a CPU mapping never binds the bo into the aperture. Whilst there may be some merit to limiting the maximum size of the bo to something that can be utilized by the GPU, that limit itself does not belong as a safeguard to mmapping the bo, so remove the check entirely. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-01-30 17:54:35 +01:00
Daniel Vetter	39965b3766	drm/i915: don't trash the gtt when running out of fences With the fence accounting fixed up in the previous commit not finding enough fences is a fatal error and userspace bug. Trashing the entire gtt is not gonna turn up that missing fence, so don't do this by returning another error than ENOSPC. This has the added benefit that it's easier to distinguish fence accounting errors from gtt space accounting issues. TTM serves as precedent for the EDEADLK error code - it returns it when the reservation code needs resources already blocked by the current reservation. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-01-29 18:24:10 +01:00
Chris Wilson	1690e1eb7a	drm/i915: Separate fence pin counting from normal bind pin counting In order to correctly account for reserving space in the GTT and fences for a batch buffer, we need to independently track whether the fence is pinned due to a fenced GPU access in the batch or whether the buffer is pinned in the aperture. Currently we count the fence as pinned if the buffer has already been seen in the execbuffer. This leads to a false accounting of available fence registers, causing frequent mass evictions. Worse, if coupled with the change to make i915_gem_object_get_fence() report EDEADLK upon fence starvation, the batchbuffer can fail with only one fence required... Fixes intel-gpu-tools/tests/gem_fenced_exec_thrash Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38735 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Tested-by: Paul Neumann <paul104x@yahoo.de> [danvet: Resolve the functional conflict with Jesse Barnes sprite patches, acked by Chris Wilson on irc.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-01-29 18:23:37 +01:00
Ben Widawsky	b93f9cf14e	drm/i915: argument to control retiring behavior Sometimes it may be the case that when we idle the gpu or wait on something, we don't actually want to process the retiring list. This patch allows callers to choose the behavior. Reviewed-by: Keith Packard <keithp@keithp.com> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-01-26 11:19:19 +01:00
Eugeni Dodonov	3d29b842e5	drm/i915: add a LLC feature flag in device description LLC is not SNB/IVB-specific, so we should check for it in a more generic way. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-01-17 20:01:45 +01:00
Eric Anholt	e959b5db4a	drm/i915: Make the fallback IRQ wait not sleep. The waits we do here are generally so short that sleeping is a bad idea unless we have an IRQ to wake us up. Improves regression test performance from 18 minutes to 3.5 minutes on gen7, which is now consistent with the previous generation. Signed-off-by: Eric Anholt <eric@anholt.net> Tested-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Keith Packard <keithp@keithp.com>	2012-01-03 09:31:16 -08:00
Eric Anholt	7ea29b13e5	drm/i915: Do the fallback non-IRQ wait in ring throttle, too. As a workaround for IRQ synchronization issues in the gen7 BLT ring, we want to turn the two wait functions into polling loops. Signed-off-by: Eric Anholt <eric@anholt.net> Tested-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Keith Packard <keithp@keithp.com>	2012-01-03 09:31:14 -08:00
Linus Torvalds	ed4a51842a	Revert "drm/i915: fix infinite recursion on unbind due to ilk vt-d w/a" This reverts commit `eb1711bb94`. It blows up the i915 seqno tracking, resulting in the BUG_ON(seqno == 0); in i915_wait_request() triggering, which will cause lock-ups. See for example https://bugs.launchpad.net/ubuntu/+source/linux/+bug/903010 https://lkml.org/lkml/2011/12/14/395 Reported-requested-and-tested-by: Dirk Hohndel <dirk@hohndel.org> Reported-by: Richard Eames <Richard.Eames@flinders.edu.au> Reported-by: Rocko Requin <rockorequin@hotmail.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dave Airlie <airlied@redhat.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Keith Packard <keithp@keithp.com> Cc: Eric Anholt <eric@anholt.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-12-16 12:58:39 -08:00
Daniel Vetter	eb1711bb94	drm/i915: fix infinite recursion on unbind due to ilk vt-d w/a The recursion loop goes retire_requests->unbind->gpu_idle->retire_requests. Every time we go through this we need a - active object that can be retired - and there are no other references to that object than the one from the active list, so that it gets unbound and freed immediately. Otherwise the recursion stops. So the recursion is only limited by the number of objects that fit these requirements sitting in the active list any time retire_request is called. Issue exercised by tests/gem_unref_active_buffers from i-g-t. There's been a decent bikeshed discussion whether it wouldn't be better to pass around a flag, but imo this is o.k. for such a limited case that only supports a w/a. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=42180 Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Chris Wilson <chris@chris-wilson> [ickle- we built better bikesheds, but this keeps the rain off for now] Tested-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2011-12-07 10:44:40 +00:00
Rakib Mullick	457eafce61	drm, i915: Fix memory leak in i915_gem_busy_ioctl(). A call to i915_add_request() has been made in function i915_gem_busy_ioctl(). i915_add_request can fail, so in its exit path previously allocated memory needs to be freed. Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com> Reviewed-by: Keith Packard <keithp@keithp.com> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-11-17 12:57:45 -08:00
Jesse Barnes	680da876f4	drm/i915: enable cacheable objects on Ivybridge IVB supports these bits as well. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-11-03 16:17:57 -07:00
Daniel Vetter	4b9de737fa	drm/i915: add constants to size fence arrays and fields In preparation for supporting 32 fences on Ivybridge. Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-11-03 09:20:37 -07:00
Eric Anholt	ff56b0bc84	drm/i915: Fix object refcount leak on mmappable size limit error path. I've been seeing memory leaks on my system in the form of large (300-400MB) GEM objects created by now-dead processes laying around clogging up memory. I usually notice when it gets to about 1.2GB of them. Hopefully this clears up the issue, but I just found this bug by inspection. Signed-off-by: Eric Anholt <eric@anholt.net> Cc: stable@kernel.org Signed-off-by: Keith Packard <keithp@keithp.com>	2011-11-01 09:15:17 -07:00
Ben Widawsky	f372b85463	drm/i915: Remove early exit on i915_gpu_idle [Description from: Daniel Vetter] I've just discussed this quickly with Chris on irc and it's probably best to just kill the list_empty early bailout. gpu_idle isn't a fastpath, so who cares. One candidate where we emit commands to the ring without adding anything onto these lists is e.g. pageflip. There are probably more. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-10-20 15:26:38 -07:00
Daniel Vetter	130c2561de	drm/i915: drop KM_USER0 argument to k(un)map_atomic Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-10-20 15:26:37 -07:00
Chris Wilson	8ffc024681	drm/i915: Defend against userspace creating a gem object with size==0 We currently only round up the userspace size to the next page. We assume that userspace hasn't made a mistake and requested a zero-length gem object and all through our internal code we then presume that every object is backed by at least a single page. Fix that oversight and report EINVAL back to userspace if they try to create a zero length object. [danvet: This fixes tests/gem_bad_length] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-10-20 14:11:19 -07:00
Daniel Vetter	6dacfd2faa	drm/i915: simplify swapin/out swizzle checking a bit Use the helper function already employed by the pwrite/pread functions. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-10-20 14:11:18 -07:00
Dave Airlie	88ef4e3f4f	Merge branch 'drm-intel-next' of git://people.freedesktop.org/~keithp/linux into drm-next * 'drm-intel-next' of git://people.freedesktop.org/~keithp/linux: Drivers: i915: Fix all space related issues.	2011-09-20 09:36:22 +01:00
Akshay Joshi	0206e353a0	Drivers: i915: Fix all space related issues. Various issues involved with the space character were generating warnings in the checkpatch.pl file. This patch removes most of those warnings. Signed-off-by: Akshay Joshi <me@akshayjoshi.com> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-09-19 18:01:47 -07:00
Rob Clark	b464e9a25c	drm/i915: use common functions for mmap offset creation Signed-off-by: Rob Clark <rob@ti.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2011-08-30 11:07:00 +01:00
Keith Packard	df7976797f	Merge branch 'drm-intel-fixes' into drm-intel-next	2011-07-22 13:40:42 -07:00
Keith Packard	f0b69efc29	drm/i915: Skip GPU wait for scanout pin while wedged Failing to pin a scanout buffer will most likely lead to a black screen, so if the GPU is wedged, then just let the pin happen and hope that things work out OK. v2: Just ignore any error from i915_gem_object_wait_rendering, as suggested by Chris Wilson Signed-off-by: Keith Packard <keithp@keithp.com>	2011-07-21 20:18:31 -07:00
Chris Wilson	e28f871165	drm/i915: Fix unfenced alignment on pre-G33 hardware Align unfenced buffers on older hardware to the power-of-two object size. The docs suggest that it should be possible to align only to a power-of-two tile height, but using the already computed fence size is easier and always correct. We also have to make sure that we unbind misaligned buffers upon tiling changes. In order to prevent a repetition of this bug, we change the interface to the alignment computation routines to force the caller to provide the requested alignment and size of the GTT binding rather than assume the current values on the object. Reported-and-tested-by: Sitosfe Wheeler <sitsofe@yahoo.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=36326 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-07-18 14:02:06 -07:00
Keith Packard	8eb2c0ee67	Merge branch 'drm-intel-fixes' into drm-intel-next	2011-06-29 10:34:54 -07:00
Ben Widawsky	3e0dc6b01f	drm/i915: hangcheck disable parameter Provide a parameter to disable hangcheck. This is useful mostly for developers trying to debug known problems, and probably should not be touched by normal users. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-06-29 10:32:08 -07:00
Linus Torvalds	0d72c6fcb5	Merge branch 'drm-intel-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/keithp/linux-2.6 * 'drm-intel-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/keithp/linux-2.6: drm/i915: Use chipset-specific irq installers drm/i915: forcewake fix after reset drm/i915: add Ivy Bridge page flip support drm/i915: split page flip queueing into per-chipset functions	2011-06-28 11:15:57 -07:00
Keith Packard	6ae77e6b6a	Merge branch 'drm-intel-fixes' into drm-intel-next	2011-06-28 10:29:47 -07:00
Chris Wilson	f01c22fd59	drm/i915: Use chipset-specific irq installers Konstantin Belousov pointed out that `4697995b98` replaced the generic i915_driver_irq_install() functions with chipset specific routines accessible only through driver->irq_install(). So update the sanity check in i915_request_wait() to match. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-06-28 10:20:06 -07:00
Hugh Dickins	e2377fe0b6	drm/i915: use shmem_truncate_range The interface to ->truncate_range is changing very slightly: once "tmpfs: take control of its truncate_range" has been applied, this can be applied. For now there is only a slight inefficiency while this remains unapplied, but it will soon become essential for managing shmem's use of swap. Change i915_gem_object_truncate() to use shmem_truncate_range() directly: which should also spare i915 later change if we switch from inode_operations->truncate_range to file_operations->fallocate. Signed-off-by: Hugh Dickins <hughd@google.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Keith Packard <keithp@keithp.com> Cc: Dave Airlie <airlied@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-06-27 18:00:14 -07:00
Hugh Dickins	5949eac4d9	drm/i915: use shmem_read_mapping_page Soon tmpfs will stop supporting ->readpage and read_cache_page_gfp(): once "tmpfs: add shmem_read_mapping_page_gfp" has been applied, this patch can be applied to ease the transition. Make i915_gem_object_get_pages_gtt() use shmem_read_mapping_page_gfp() in the one place it's needed; elsewhere use shmem_read_mapping_page(), with the mapping's gfp_mask properly initialized. Forget about __GFP_COLD: since tmpfs initializes its pages with memset, asking for a cold page is counter-productive. Include linux/shmem_fs.h also in drm_gem.c: with shmem_file_setup() now declared there too, we shall remove the prototype from linux/mm.h later. Signed-off-by: Hugh Dickins <hughd@google.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Keith Packard <keithp@keithp.com> Cc: Dave Airlie <airlied@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-06-27 18:00:13 -07:00
Keith Packard	b97c3d9c16	drm/i915: i915_gem_object_finish_gtt must always release gtt mmap Even if the object is no longer in the GTT domain, there may still be a user space mapping which needs to be released. Without this fix, render-based text (mostly in firefox) would occasionally get corrupted when the system was under load. Signed-off-by: Keith Packard <keithp@keithp.com>	2011-06-24 21:02:59 -07:00
Keith Packard	2cd1176bd9	Merge branch 'drm-intel-fixes' into drm-intel-next	2011-06-21 12:02:57 -07:00
Eric Anholt	e92d03bff9	Revert "drm/i915: Kill GTT mappings when moving from GTT domain" This reverts commit `4a684a4117`. Userland has always been required to set the object's domain to GTT before using it through a GTT mapping, it's not something that the kernel is supposed to enforce. (The pagefault support is so that we can handle multiple mappings without userland having to pin across them, not so that userland can use GTT after GPU domains without telling the kernel). Fixes 19.2% +/- 0.8% (n=6) performance regression in cairo-gl firefox-talos-gfx on my T420 laptop. Signed-off-by: Keith Packard <keithp@keithp.com>	2011-06-21 11:11:02 -07:00
Jesper Juhl	b65552f06c	drm/i915: Don't leak in i915_gem_shmem_pread_slow() It seems to me that we are leaking 'user_pages' in drivers/gpu/drm/i915/i915_gem.c::i915_gem_shmem_pread_slow() if read_cache_page_gfp() fails. Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Dave Airlie <airlied@redhat.com>	2011-06-14 11:00:54 +10:00
Eric Anholt	a187111207	drm/i915: Use the LLC mode on gen6 for everything but display. Improves full-screen openarena on my laptop 20.3% +/- 4.0% (n=3) Improves 800x600 nexuiz on my laptop 12.3% +/- 0.1% (n=3) We have more room to improve with doing LLC caching for display using GFDT, and in doing LLC+MLC caching, but this was an easy performance win and incremental improvement toward those two. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2011-06-09 21:51:22 -07:00
Eric Anholt	a7ef0640d9	drm/i915: Use the uncached domain for the display planes The simplest and common method for ensuring scanout coherency on all chipsets is to mark the scanout buffers as uncached (and for userspace to remember to flush the render cache every so often). We can improve upon this for later generations by marking scanout objects as GFDT and only flush those cachelines when required. However, we start simple. [v2: Move the set to uncached above the clflush. Otherwise, we'd skip the clflush and try to scan out data that was still sitting in the cache.] Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2011-06-09 21:51:20 -07:00
Chris Wilson	2da3b9b940	drm/i915: Combine pinning with setting to the display plane We need to perform a few operations in order to move the object into the display plane (where it can be accessed coherently by the display engine) that are important for future safety to forbid whilst pinned. As a result, we need to perform some of the operations before pinning, but some are required once we have been bound into the GTT. So combine the pinning performed by all the callers with set_to_display_plane(), so this complication is contained within the single function. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2011-06-09 21:51:19 -07:00
Chris Wilson	e4ffd173a1	drm/i915: Add an interface to dynamically change the cache level [anholt v2: Don't forget that when going from cached to uncached, we haven't been tracking the write domain from the CPU perspective, since we haven't needed it for GPU coherency.] [ickle v3: We also need to make sure we relinquish any fences on older chipsets and clear the GTT for sane domain tracking.] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2011-06-09 21:51:16 -07:00
Chris Wilson	b5ffc9bc38	drm/i915: Introduce i915_gem_object_finish_gtt() Like its sibling finish_gpu(), this function clears the object from the GTT domain, forcing it to trigger a domain invalidation should we ever need to use it via the GTT again. Note that the most important side-effect of finishing the GTT domain (aside from clearing the tracking read/write domains) is that it imposes a memory barrier so that all accesses are complete before it returns, which is important if you intend to be modifying translation tables shortly afterwards. The second most important side-effect is that it tears down the GTT mappings forcing a page-fault and invalidation on next user access to the object. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2011-06-09 21:51:14 -07:00
Chris Wilson	a8198eea15	drm/i915: Introduce i915_gem_object_finish_gpu() ... reincarnated from i915_gem_object_flush_gpu(). The semantic difference is that after calling finish_gpu() the object no longer resides in any GPU domain, and so will cause the GPU caches to be invalidated if it is ever used again. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2011-06-09 11:43:47 -07:00
Daniel Vetter	c8ebc2b076	drm/915: fix relaxed tiling on gen2: tile height A tile on gen2 has a size of 2kb, stride of 128 bytes and 16 rows. Userspace was broken and assumed 8 rows. Chris Wilson noted that the kernel unfortunately can't reliably check that because libdrm rounds up the size to the next bucket. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-06-04 10:41:12 -07:00
Chris Wilson	c8cbbb8ba9	drm/i915: s/addr & ~PAGE_MASK/offset_in_page(addr)/ Convert our open coded offset_in_page() to the common macro. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Keith Packard <keithp@keithp.com> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-06-04 10:40:42 -07:00
Ying Han	1495f230fa	vmscan: change shrinker API by passing shrink_control struct Change each shrinker's API by consolidating the existing parameters into shrink_control struct. This will simplify any further features added w/o touching each file of shrinker. [akpm@linux-foundation.org: fix build] [akpm@linux-foundation.org: fix warning] [kosaki.motohiro@jp.fujitsu.com: fix up new shrinker API] [akpm@linux-foundation.org: fix xfs warning] [akpm@linux-foundation.org: update gfs2] Signed-off-by: Ying Han <yinghan@google.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Minchan Kim <minchan.kim@gmail.com> Acked-by: Pavel Emelyanov <xemul@openvz.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Acked-by: Rik van Riel <riel@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Hugh Dickins <hughd@google.com> Cc: Dave Hansen <dave@linux.vnet.ibm.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-25 08:39:26 -07:00
Eric Anholt	25aebfc30b	drm/i915: Add support for fence registers on Ivybridge. The registers are the same as on Sandybridge. Fixes scrambled display in X when it does software drawing to the GTT, and scans the results out as tiled. Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-05-13 18:12:51 -07:00
Eric Anholt	10ed13e4a5	drm/i915: Use existing function instead of open-coding fence reg clear. This is one less place to miss a new INTEL_INFO(dev)->gen update now. Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-05-13 18:12:50 -07:00
Chris Wilson	9c23f7fc4c	drm/i915: Do not clflush snooped objects Rely on the GPU snooping into the CPU cache for appropriately bound objects on MI_FLUSH. Or perhaps one day we will have a cache-coherent CPU/GPU package... Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-05-10 13:56:44 -07:00
Chris Wilson	93dfb40cd8	drm/i915: Rename agp_type to cache_level ... to clarify just how we use it inside the driver and remove the confusion of the poorly matching agp_type names. We still need to translate through agp_type for interface into the fake AGP driver. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-05-10 13:56:43 -07:00
Chris Wilson	f6e47884e7	drm/i915: Avoid unmapping pages from a NULL address space Found by gem_stress. As we perform retirement from a workqueue, it is possible for us to free and unbind objects after the last close on the device, and so after the address space has been torn down and reset to NULL: BUG: unable to handle kernel NULL pointer dereference at 00000054 IP: [<c1295a20>] mutex_lock+0xf/0x27 *pde = 00000000 Oops: 0002 [#1] SMP last sysfs file: /sys/module/vt/parameters/default_utf8 Pid: 5, comm: kworker/u:0 Not tainted 2.6.38+ #214 EIP: 0060:[<c1295a20>] EFLAGS: 00010206 CPU: 1 EIP is at mutex_lock+0xf/0x27 EAX: 00000054 EBX: 00000054 ECX: 00000000 EDX: 00012fff ESI: 00000028 EDI: 00000000 EBP: f706fe20 ESP: f706fe18 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 Process kworker/u:0 (pid: 5, ti=f706e000 task=f7060d00 task.ti=f706e000) Stack: f5aa3c60 00000000 f706fe74 c107e7df 00000246 dea55380 00000054 f5aa3c60 f706fe44 00000061 f70b4000 c13fff84 00000008 f706fe54 00000000 00000000 00012f00 00012fff 00000028 c109e575 f6b36700 00100000 00000000 f706fe90 Call Trace: [<c107e7df>] unmap_mapping_range+0x7d/0x1e6 [<c109e575>] ? mntput_no_expire+0x52/0xb6 [<c11c12f6>] i915_gem_release_mmap+0x49/0x58 [<c11c3449>] i915_gem_object_unbind+0x4c/0x125 [<c11c353f>] i915_gem_free_object_tail+0x1d/0xdb [<c11c55a2>] i915_gem_free_object+0x3d/0x41 [<c11a6be2>] ? drm_gem_object_free+0x0/0x27 [<c11a6c07>] drm_gem_object_free+0x25/0x27 [<c113c3ca>] kref_put+0x39/0x42 [<c11c0a59>] drm_gem_object_unreference+0x16/0x18 [<c11c0b15>] i915_gem_object_move_to_inactive+0xba/0xbe [<c11c0c87>] i915_gem_retire_requests_ring+0x16e/0x1a5 [<c11c3645>] i915_gem_retire_requests+0x48/0x63 [<c11c36ac>] i915_gem_retire_work_handler+0x4c/0x117 [<c10385d1>] process_one_work+0x140/0x21b [<c103734c>] ? __need_more_worker+0x13/0x2a [<c10373b1>] ? need_to_create_worker+0x1c/0x35 [<c11c3660>] ? i915_gem_retire_work_handler+0x0/0x117 [<c1038faf>] worker_thread+0xd4/0x14b [<c1038edb>] ? 
worker_thread+0x0/0x14b [<c103be1b>] kthread+0x68/0x6d [<c103bdb3>] ? kthread+0x0/0x6d [<c12970f6>] kernel_thread_helper+0x6/0x10 Code: 00 e8 98 fe ff ff 5d c3 55 89 e5 3e 8d 74 26 00 ba 01 00 00 00 e8 84 fe ff ff 5d c3 55 89 e5 53 8d 64 24 fc 3e 8d 74 26 00 89 c3 <f0> ff 08 79 05 e8 ab ff ff ff 89 e0 25 00 e0 ff ff 89 43 10 58 EIP: [<c1295a20>] mutex_lock+0xf/0x27 SS:ESP 0068:f706fe18 CR2: 0000000000000054 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Keith Packard <keithp@keithp.com>	2011-03-23 09:17:03 +00:00
Chris Wilson	26e12f8943	drm/i915: Fix use after free within tracepoint Detected by scripts/coccinelle/free/kfree.cocci. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Keith Packard <keithp@keithp.com>	2011-03-23 09:17:02 +00:00
Chris Wilson	36d527dead	drm/i915: Restore missing command flush before interrupt on BLT ring We always skipped flushing the BLT ring if the request flush did not include the RENDER domain. However, this neglects that we try to flush the COMMAND domain after every batch and before the breadcrumb interrupt (to make sure the batch is indeed completed prior to the interrupt firing and so ensuring CPU coherency). As a result of the missing flush, incoherency did indeed creep in, most notably when using lots of command buffers and so potentially rewriting an active command buffer (i.e. the GPU was still executing from it even though the following interrupt had already fired and the request/buffer retired). As all ring->flush routines now have the same preconditions, de-duplicate and move those checks up into i915_gem_flush_ring(). Fixes gem_linear_blit. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=35284 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Tested-by: mengmeng.meng@intel.com	2011-03-23 09:17:01 +00:00
Chris Wilson	ed0291fd16	drm/i915: Fix computation of pitch for dumb bo creator Cc: Dave Airlie <airlied@linux.ie> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-03-23 09:17:00 +00:00
Chris Wilson	29c5a58728	drm/i915: Fix tiling corruption from pipelined fencing ... even though it was disabled. A mistake in the handling of fence reuse caused us to skip the vital delay of waiting for the object to finish rendering before changing the register. This resulted in us changing the fence register whilst the bo was active and so causing the blits to complete using the wrong stride or even the wrong tiling. (Visually the effect is that small blocks of the screen look like they have been interlaced). The fix is to wait for the GPU to finish using the memory region pointed to by the fence before changing it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34584 Cc: Andy Whitcroft <apw@canonical.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> [Note for 2.6.38-stable, we need to reintroduce the interruptible passing] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Dave Airlie <airlied@linux.ie>	2011-03-23 09:12:24 +00:00
Herton Ronaldo Krzesinski	09bfa51773	drm/i915: Prevent racy removal of request from client list When i915_gem_retire_requests_ring calls i915_gem_request_remove_from_client, the client_list for that request may already be removed in i915_gem_release. So we may call twice list_del(&request->client_list), resulting in an oops like this report: [126167.230394] BUG: unable to handle kernel paging request at 00100104 [126167.230699] IP: [<f8c2ce44>] i915_gem_retire_requests_ring+0xd4/0x240 [i915] [126167.231042] pdpt = 00000000314c1001 pde = 0000000000000000 [126167.231314] Oops: 0002 [#1] SMP [126167.231471] last sysfs file: /sys/devices/LNXSYSTM:00/device:00/PNP0C0A:00/power_supply/BAT1/current_now [126167.231901] Modules linked in: snd_seq_dummy nls_utf8 isofs btrfs zlib_deflate libcrc32c ufs qnx4 hfsplus hfs minix ntfs vfat msdos fat jfs xfs exportfs reiserfs cryptd aes_i586 aes_generic binfmt_misc vboxnetadp vboxnetflt vboxdrv parport_pc ppdev snd_hda_codec_hdmi snd_hda_codec_conexant snd_hda_intel snd_hda_codec snd_hwdep arc4 snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq uvcvideo videodev snd_timer snd_seq_device joydev iwlagn iwlcore mac80211 snd cfg80211 soundcore i915 drm_kms_helper snd_page_alloc psmouse drm serio_raw i2c_algo_bit video lp parport usbhid hid sky2 sdhci_pci ahci sdhci libahci [126167.232018] [126167.232018] Pid: 1101, comm: Xorg Not tainted 2.6.38-6-generic-pae #34-Ubuntu Gateway MC7833U / [126167.232018] EIP: 0060:[<f8c2ce44>] EFLAGS: 00213246 CPU: 0 [126167.232018] EIP is at i915_gem_retire_requests_ring+0xd4/0x240 [i915] [126167.232018] EAX: 00200200 EBX: f1ac25b0 ECX: 00000040 EDX: 00100100 [126167.232018] ESI: f1a2801c EDI: e87fc060 EBP: ef4d7dd8 ESP: ef4d7db0 [126167.232018] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 [126167.232018] Process Xorg (pid: 1101, ti=ef4d6000 task=f1ba6500 task.ti=ef4d6000) [126167.232018] Stack: [126167.232018] f1a28000 f1a2809c f1a28094 0058bd97 f1aa2400 f1a2801c 0058bd7b 0058bd85 
[126167.232018] f1a2801c f1a28000 ef4d7e38 f8c2e995 ef4d7e30 ef4d7e60 c14d1ebc f6b3a040 [126167.232018] f1522cc0 000000db 00000000 f1ba6500 ffffffa1 00000000 00000001 f1a29214 [126167.232018] Call Trace: Unfortunately the call trace reported was cut, but looking at debug symbols the crash is at __list_del, when probably list_del is called twice on the same request->client_list, as the dereferenced value is LIST_POISON1 + 4, and by looking more at the debug symbols before the list_del call it should have been called by i915_gem_request_remove_from_client. And as I can see in the code, it seems we indeed have the possibility to remove a request->client_list twice, which would cause the above, because we do list_del(&request->client_list) in both i915_gem_request_remove_from_client and i915_gem_release. As Chris Wilson pointed out, it's indeed the case: "(...) I had thought that the actual insertion/deletion was serialised under the struct mutex and the intention of the spinlock was to protect the unlocked list traversal during throttling. However, I missed that i915_gem_release() is also called without struct mutex and so we do need the double check for i915_gem_request_remove_from_client()." This change does the required check to avoid the duplicate remove of request->client_list. Bugzilla: http://bugs.launchpad.net/bugs/733780 Cc: stable@kernel.org # 2.6.38 Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-03-23 06:41:12 +00:00
Dave Airlie	34db18abd3	Merge remote branch 'intel/drm-intel-next' of ../drm-next into drm-core-next * 'intel/drm-intel-next' of ../drm-next: (755 commits) drm/i915: Only wait on a pending flip if we intend to write to the buffer drm/i915/dp: Sanity check eDP existence drm/i915: Rebind the buffer if its alignment constraints changes with tiling drm/i915: Disable GPU semaphores by default drm/i915: Do not overflow the MMADDR write FIFO Revert "drm/i915: fix corruptions on i8xx due to relaxed fencing" drm/i915: Don't save/restore hardware status page address register drm/i915: don't store the reg value for HWS_PGA drm/i915: fix memory corruption with GM965 and >4GB RAM Linux 2.6.38-rc7 Revert "TPM: Long default timeout fix" drm/i915: Re-enable GPU semaphores for SandyBridge mobile drm/i915: Replace vblank PM QoS with "Interrupt-Based AGPBUSY#" Revert "drm/i915: Use PM QoS to prevent C-State starvation of gen3 GPU" drm/i915: Allow relocation deltas outside of target bo drm/i915: Silence an innocuous compiler warning for an unused variable fs/block_dev.c: fix new kernel-doc warning ACPI: Fix build for CONFIG_NET unset mm: <asm-generic/pgtable.h> must include <linux/mm_types.h> x86: Use u32 instead of long to set reset vector back to 0 ... Conflicts: drivers/gpu/drm/i915/i915_gem.c	2011-03-14 14:15:13 +10:00
Chris Wilson	47ae63e0c2	Merge branch 'drm-intel-fixes' into drm-intel-next Apply the trivial conflicting regression fixes, but keep GPU semaphores enabled. Conflicts: drivers/gpu/drm/i915/i915_drv.h drivers/gpu/drm/i915/i915_gem_execbuffer.c	2011-03-07 12:35:15 +00:00
Chris Wilson	467cffba85	drm/i915: Rebind the buffer if its alignment constraints changes with tiling Early gen3 and gen2 chipset do not have the relaxed per-surface tiling constraints of the later chipsets, so we need to check that the GTT alignment is correct for the new tiling. If it is not, we need to rebind. Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-03-07 11:02:16 +00:00
Chris Wilson	ce453d81cb	drm/i915: Use a device flag for non-interruptible phases The code paths for modesetting are growing in complexity as we may need to move the buffers around in order to fit the scanout in the aperture. Therefore we face a choice as to whether to thread the interruptible status through the entire pinning and unbinding code paths or to add a flag to the device when we may not be interrupted by a signal. This does the latter and so fixes a few instances of modesetting failures under stress. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-02-22 15:56:25 +00:00
Chris Wilson	c872522663	drm/i915: Protect against drm_gem_object not being the first member Dave Airlie spotted that we had a potential bug should we ever rearrange the drm_i915_gem_object so that the base drm_gem_object was not its first member. He noticed that we often convert the return of drm_gem_object_lookup() immediately into drm_i915_gem_object and then check the result for nullity. This is only valid when the base object is the first member and so the superobject has the same address. Play safe instead and use the compiler to convert back to the original return address for sanity testing. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-02-22 15:55:57 +00:00
Chris Wilson	bed636abea	drm/i915: i915_mutex_interruptible() returns -EINTR ... so we handle that for i915_gem_fault() in the same manner as ERESTARTSYS, or we send a SIGBUS to the faulting application. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-02-11 20:32:44 +00:00
Chris Wilson	8d7e3de1e0	drm/i915: Skip the no-op domain changes when already in CPU\|GTT domains Removes some superfluous fluff from tracing... Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-02-07 15:24:03 +00:00
Chris Wilson	db53a30261	drm/i915: Refine tracepoints A lot of minor tweaks to fix the tracepoints, improve the outputting for ftrace, and to generally make the tracepoints useful again. It is a start and enough to begin identifying performance issues and gaps in our coverage. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-02-07 14:59:18 +00:00
Chris Wilson	d9bc7e9f32	drm/i915: Fix infinite loop regression from `21dd3734` By returning EAGAIN upon a wedged GPU before attempting to wait, we would hit an infinite loop of repeating operation without ever progressing. Instead this needs to be EIO so that userspace knows that the GPU is truly wedged and not in the process of error recovery. Similarly, we need to handle the error recovery during i915_gem_fault. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-02-07 14:33:55 +00:00
Dave Airlie	ff72145bad	drm: dumb scanout create/mmap for intel/radeon (v3) This is just an idea that might or might not be a good idea, it basically adds two ioctls to create and map a dumb buffer suitable for scanout. The handle can be passed to the KMS ioctls to create a framebuffer. It looks to me like it would be useful in the following cases: a) in development drivers - we can always provide a shadowfb fallback. b) libkms users - we can clean up libkms a lot and avoid linking to libdrm_*. c) plymouth via libkms is a lot easier. Userspace bits would be just calls + mmaps. We could probably mark these handles somehow as not being suitable for acceleration so as to stop people who are dumber than dumb. Signed-off-by: Dave Airlie <airlied@redhat.com>	2011-02-07 12:16:14 +10:00
Chris Wilson	21dd373486	drm/i915: Defer reporting EIO until we try to use the GPU Instead of reporting EIO upfront in the entrance of an ioctl that may or may not attempt to use the GPU, defer the actual detection of an invalid ioctl to when we issue a GPU instruction. This allows us to continue to use bo in video memory (via pread/pwrite and mmap) after the GPU has hung. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-01-27 11:06:07 +00:00
Chris Wilson	e110e8d672	drm/i915: Check wedged status before throttling Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-01-27 11:05:51 +00:00
Chris Wilson	29ee399131	drm/i915: Silence a few -Wunused-but-set-variable Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-01-25 10:33:11 +00:00
Chris Wilson	bee4a186c1	drm/i915,agp/intel: Do not clear stolen entries We can only utilize the stolen portion of the GTT if we are in sole charge of the hardware. This is only true if using GEM and KMS, otherwise VESA continues to access stolen memory. Reported-by: Arnd Bergmann <arnd@arndb.de> Reported-by: Frederic Weisbecker <fweisbec@gmail.com> Tested-by: Jiri Olsa <jolsa@redhat.com> Tested-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-01-24 18:26:25 +00:00
Chris Wilson	076e2c0eb8	drm/i915: Fix use of invalid array size for ring->sync_seqno There are I915_NUM_RINGS-1 inter-ring synchronisation counters, but we were clearing I915_NUM_RINGS of them. Oops. Reported-by: Jiri Slaby <jirislaby@gmail.com> Tested-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-01-23 12:52:11 +00:00
Chris Wilson	809b63349c	drm/i915: If we hit OOM when allocating GTT pages, clear the aperture Rather than evicting an object at random, which is unlikely to alleviate the memory pressure sufficient to allow us to continue, zap the entire aperture. That should give the system long enough to recover and reap some pages from the evicted objects, forestalling the allocation error for the new object. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-01-11 22:55:48 +00:00
Chris Wilson	0a58705b2f	drm/i915: Periodically flush the active lists and requests In order to retire active buffers whilst no client is active, we need to insert our own flush requests onto the ring. This is useful for servers that queue up some rendering and then go to sleep as it allows us to complete the processing of those requests, potentially making that memory available again much earlier. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-01-11 22:15:30 +00:00
Chris Wilson	882417851a	drm/i915: Propagate error from flushing the ring ... in order to avoid a BUG() and potential unbounded waits. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-01-11 20:44:50 +00:00
Chris Wilson	b72f3acb71	drm/i915: Handle ringbuffer stalls when flushing Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-01-11 20:43:55 +00:00
Chris Wilson	63256ec534	drm/i915: Enforce write ordering through the GTT We need to ensure that writes through the GTT land before any modification to the MMIO registers and so must impose a mandatory write barrier when flushing the GTT domain. This was revealed by relaxing the write ordering by experimentally mapping the registers and the GATT as write-combining. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2011-01-11 20:42:53 +00:00
Chris Wilson	72bfa19c8d	drm/i915: Allow the application to choose the constant addressing mode The relative-to-general state default is useless as it means having to rewrite the streaming kernels for each batch. Relative-to-surface is more useful, as that stream usually needs to be rewritten for each batch. And absolute addressing mode, vital if you start streaming state, is also only available by adjusting the register... Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-12-20 09:41:36 +00:00
Chris Wilson	b5ba177d8d	drm/i915: Poll for seqno completion if IRQ is disabled Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32288 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-12-14 12:19:25 +00:00
Chris Wilson	b13c2b96bf	drm/i915/ringbuffer: Make IRQ refcnting atomic In order to enforce the correct memory barriers for irq get/put, we need to perform the actual counting using atomic operations. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-12-14 11:34:46 +00:00
Chris Wilson	1a1c69762a	Merge branch 'drm-intel-fixes' into drm-intel-next Conflicts: drivers/gpu/drm/i915/i915_gem.c drivers/gpu/drm/i915/intel_dp.c	2010-12-07 23:02:08 +00:00
Chris Wilson	7a1948768c	drm/i915: Emit a request to clear a flushed and idle ring for unbusy bo In order for bos to retire eventually, a request must be sent down the ring. This is expected, for example, by occlusion queries, upon which mesa will wait (whilst running glean) before issuing more batches, and so the normal activity upon the ring is suspended and we need to emit a request to clear the idle ring. Reported-by: Jinjin, Wang <jinjin.wang@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=30380 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-12-07 10:59:14 +00:00
Chris Wilson	0be732841f	drm/i915: Wait for the bo if a display flip is pipelined on the other ring Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-12-06 14:37:27 +00:00
Chris Wilson	0ac74c6b33	drm/i915: Only emit a flush if there is an outstanding gpu write Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-12-06 14:36:02 +00:00
Chris Wilson	6bda10d152	drm/i915: Completely disable fence pipelining. I'm still seeing tiling corruption of PutImage and CopyArea (I think) under mutter on pnv, so obviously the pipelining logic is deeply flawed. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-12-05 23:19:37 +00:00
Chris Wilson	1ec14ad313	drm/i915: Implement GPU semaphores for inter-ring synchronisation on SNB The bulk of the change is to convert the growing list of rings into an array so that the relationship between the rings and the semaphore sync registers can be easily computed. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-12-05 00:37:38 +00:00
Chris Wilson	60de2ba51e	drm/i915: Kill the get_fence tracepoint As the tracepoint is now decoupled from when the actual register is assigned and was never complemented by detailing when the object lost its fence, it has outlived its limited usefulness. Profiling the actual stalls is a far more profitable venture anyway. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-12-02 10:20:47 +00:00
Chris Wilson	c6748e09ee	drm/i915: Remove inactive LRU tracking from set_domain_ioctl As the userspace mappings are torn down on every GPU write, we prefer to track when the buffer is activated (via a fresh i915_gem_fault). This makes the LRU conceptually simpler. With coherent mappings, the remaining use-case for set_domain_ioctl is GPU synchronisation. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-12-02 10:16:30 +00:00
Chris Wilson	d9e86c0ee6	drm/i915: Pipelined fencing [infrastructure] With this change, every batchbuffer can use all available fences (save pinned and scanout, of course) without ever stalling the gpu! In theory. Currently the actual pipelined update of the register is disabled due to some stability issues. However, just the deferred update is a significant win. Based on a series of patches by Daniel Vetter. The premise is that before every access to a buffer through the GTT we have to declare whether we need a register or not. If the access is by the GPU, a pipelined update to the register is made via the ringbuffer, and we track the last seqno of the batches that access it. If by the CPU we wait for the last GPU access and update the register (either to clear or to set it for the current buffer). One advantage of being able to pipeline changes is that we can defer the actual updating of the fence register until we first need to access the object through the GTT, i.e. we can eliminate the stall on set_tiling. This is important as the userspace bo cache does not track the tiling status of active buffers which generate frequent stalls on gen3 when enabling tiling for an already bound buffer. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2010-12-02 10:07:05 +00:00
Chris Wilson	87ca9c8a7e	drm/i915: Prevent stalling for a GTT read back from a read-only GPU target Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-12-02 10:00:15 +00:00
Chris Wilson	7d2cb39c33	drm/i915: Release fenced GTT mapping on suspend ... so that upon first use after resume we will reacquire the fence reg. Reported-by: Keith Packard <keithp@keithp.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-28 16:12:15 +00:00
Chris Wilson	3619df035e	Merge branch 'drm-intel-fixes' into drm-intel-next Conflicts: drivers/gpu/drm/i915/i915_gem.c	2010-11-28 15:37:17 +00:00
Daniel Vetter	de18a29e0f	drm/i915: fix regression due to `ba3d8d749b` We don't track gpu flush requests in any special way. So even with obj->write_domain == 0, a gpu flush might be outstanding but not yet executed. Even worse, the latest request might use the object only for reading. So an unconditional call to object_wait_rendering is needed for !pipelined. Hence revert that patch fully and untangle the flushing from the synchronization again. Reported-by: Keith Packard <keithp@keithp.com> Tested-by: Keith Packard <keithp@keithp.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-28 09:05:12 +00:00
Chris Wilson	432e58edc9	drm/i915: Avoid allocation for execbuffer object list Besides the minimal improvement in reducing the execbuffer overhead, the real benefit is clarifying a few routines. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-25 21:19:26 +00:00
Chris Wilson	54cf91dc4e	drm/i915: Split i915_gem_execbuffer into its own file. A number of dragons have been seen lurking within the execbuffer code. The first step is then to isolate them from the rest and begin to scrutinise them in depth. Suggested by Daniel Vetter. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-25 21:19:25 +00:00
Chris Wilson	6299f992c0	drm/i915: Defer accounting until read from debugfs Simply remove our accounting of objects inside the aperture, keeping only track of what is in the aperture and its current usage. This removes the over-complication of BUGs that were attempting to keep the accounting correct and also removes the overhead of the accounting on the hot-paths. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-25 15:04:53 +00:00
Chris Wilson	2021746e1d	drm/i915: Mark a few functions as __must_check ... to benefit from the compiler checking that we remember to handle and propagate errors. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-25 15:04:04 +00:00
Chris Wilson	312817a39f	drm/i915: Only save and restore fences for UMS With KMS, we can simply relinquish the fence when we idle the GPU and reassign it upon first use. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-25 15:03:22 +00:00
Daniel Vetter	c6642782b9	drm/i915: Add a mechanism for pipelining fence register updates Not employed just yet... Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-25 15:01:39 +00:00
Chris Wilson	caea7476d4	drm/i915: More accurately track last fence usage by the GPU Based on a patch by Daniel Vetter. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-24 13:30:52 +00:00
Chris Wilson	a7a09aebe8	drm/i915: Rework execbuffer pinning Avoid evicting buffers that will be used later in the batch in order to make room for the initial buffers by pinning all bound buffers in a single pass before binding (and evicting for) fresh buffer. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-24 13:30:51 +00:00
Chris Wilson	919926aeb3	drm/i915: Thread the pipelining ring through the callers. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-23 20:19:16 +00:00
Chris Wilson	dddbc0e525	drm/i915: Remove a defunct BUG_ON This used to check the precondition that all fences were to be located in a mappable area, redundant now as those two parameters are combined into one. After pinning, we assert that the buffer is bound into the desired region. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-23 20:19:15 +00:00
Chris Wilson	b6913e4bdb	drm/i915: Move the implementation details of PIPE_CONTROL to the ringbuffer The pipe control object is allocated by the device for the sole use of the render ringbuffer. Move this detail from the general code to the render ring buffer initialisation. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-23 20:19:14 +00:00
Chris Wilson	92b88aeb1a	drm/i915: Not all mappable regions require GTT fence regions Combining map_and_fenceable revealed a bug in i915_gem_object_gtt_size() in that it always computed the appropriate fence size for the object regardless of tiling state which caused us to over-allocate linear buffers when binding to the GTT. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-23 20:19:13 +00:00
Chris Wilson	05394f3975	drm/i915: Use drm_i915_gem_object as the preferred type A glorified s/obj_priv/obj/ with a net reduction of over 100 lines and many characters! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-23 20:19:10 +00:00
Daniel Vetter	7c2e6fdf45	drm/i915: move gtt handling to i915_gem_gtt.c No more drm_*_agp in i915_gem.c! Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-23 20:14:47 +00:00
Daniel Vetter	93a37f20ea	drm/i915: track objects in the gtt This is required to restore gtt mappings on resume when agp is gone. The right way to do this would be to make struct drm_mm_node embeddable and use the allocation list maintained by the drm memory manager. But that's a bigger project. Getting rid of the per bo agp_mem will save more memory than this wastes, anyway. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-23 20:14:45 +00:00
Daniel Vetter	40ce657510	drm/i915/gtt: call chipset flush directly Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-23 20:14:44 +00:00
Daniel Vetter	23ed992a5e	drm/i915\|intel-gtt: consolidate intel-gtt.h headers ... and a few other defines. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-23 20:14:43 +00:00
Chris Wilson	e384eafc1c	Merge branch 'drm-intel-fixes' into drm-intel-next	2010-11-23 20:13:13 +00:00
Chris Wilson	bcf50e2775	drm/i915: Handle pagefaults in execbuffer user relocations Currently if we hit a pagefault when applying a user relocation for the execbuffer, we bail and return EFAULT to the application. Instead, we need to unwind, drop the dev->struct_mutex, copy all the relocation entries to a vmalloc array (to avoid any potential circular deadlocks when resolving the pagefault), retake the mutex and then apply the relocations. Afterwards, we need to again drop the lock and copy the vmalloc array back to userspace. v2: Incorporate feedback from Daniel Vetter. Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2010-11-23 20:11:43 +00:00
Chris Wilson	e624ae8e0d	Merge branch 'drm-intel-fixes' into drm-intel-next Conflicts: drivers/gpu/drm/i915/i915_gem.c	2010-11-22 08:51:36 +00:00
Chris Wilson	d1d788302e	drm/i915: Prevent integer overflow when validating the execbuffer Commit `2549d6c2` removed the vmalloc used for temporary storage of the relocation lists used during execbuffer. However, our use of vmalloc was being protected by an integer overflow check which we do want to preserve! Reported-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-21 09:30:58 +00:00
Chris Wilson	51311d0a5c	drm/i915: Do not hold mutex when faulting in user addresses Linus Torvalds found that it was rather trivial to trigger a system freeze: In fact, with lockdep, I don't even need to do the sysrq-d thing: it shows the bug as it happens. It's the X server taking the same lock recursively. Here's the problem: ============================================= [ INFO: possible recursive locking detected ] 2.6.37-rc2-00012-gbdbd01a #7 --------------------------------------------- Xorg/2816 is trying to acquire lock: (&dev->struct_mutex){+.+.+.}, at: [<ffffffff812c626c>] i915_gem_fault+0x50/0x17e but task is already holding lock: (&dev->struct_mutex){+.+.+.}, at: [<ffffffff812c403b>] i915_mutex_lock_interruptible+0x28/0x4a other info that might help us debug this: 2 locks held by Xorg/2816: #0: (&dev->struct_mutex){+.+.+.}, at: [<ffffffff812c403b>] i915_mutex_lock_interruptible+0x28/0x4a #1: (&mm->mmap_sem){++++++}, at: [<ffffffff81022d4f>] page_fault+0x156/0x37b This recursion was introduced by rearranging the locking to avoid the double locking on the fast path (4f27b5d and `fbd5a26d`) and the introduction of the prefault to encourage the fast paths (b5e4f2b). In order to undo the problem, we rearrange the code to perform the access validation upfront, attempt to prefault and then fight for control of the mutex. In the best case scenario, where the mutex is uncontended, the prefaulting is not wasted. Reported-and-tested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-19 09:30:15 +00:00
Chris Wilson	c94f28c383	Merge branch 'drm-intel-fixes' into drm-intel-next Conflicts: drivers/gpu/drm/i915/i915_gem.c drivers/gpu/drm/i915/intel_ringbuffer.c	2010-11-15 06:49:30 +00:00
Chris Wilson	1bb95834bb	Merge remote branch 'airlied/drm-fixes' into drm-intel-fixes	2010-11-15 06:33:11 +00:00
Daniel Vetter	5e78330126	drm/i915: fix relaxed tiling for gen <= 3 && !g33 g33/pineview doesn't have any alignment constraints for unfenced tiled buffers. But older chips do. Fix this. Problem introduced in `a00b10c360`. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2010-11-15 05:22:16 +00:00
Chris Wilson	85345517fe	drm/i915: Retire any pending operations on the old scanout when switching An old and oft reported bug, is that of the GPU hanging on a MI_WAIT_FOR_EVENT following a mode switch. The cause is that the GPU is waiting on a scanline counter on an inactive pipe, and so waits for a very long time until eventually the user reboots his machine. We can prevent this either by moving the WAIT into the kernel and thereby incurring considerable cost on every swapbuffers, or by waiting for the GPU to retire the last batch that accesses the framebuffer before installing a new one. As mode switches are much rarer than swap buffers, this looks like an easy choice. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=28964 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=29252 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org	2010-11-13 09:49:11 +00:00
Chris Wilson	5d97eb69bd	drm/i915: Only add the lazy request if we end up waiting for it. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-10 20:41:16 +00:00
Joe Perches	fce7d61be0	drivers/gpu/drm: Update WARN uses Coalesce long formats. Align arguments. Add missing newlines. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2010-11-09 13:37:15 +10:00
Chris Wilson	b47b30ccda	drm/i915: Avoid might_fault during pwrite whilst holding our mutex ... and so prevent a potential circular reference: [ INFO: possible circular locking dependency detected ] 2.6.37-rc1-uwe1+ #4 ------------------------------------------------------- Xorg/1401 is trying to acquire lock: (&mm->mmap_sem){++++++}, at: [<c01e4ddb>] might_fault+0x4b/0xa0 but task is already holding lock: (&dev->struct_mutex){+.+.+.}, at: [<f869c3ac>] i915_mutex_lock_interruptible+0x3c/0x60 [i915] which lock already depends on the new lock. When the locking around the pwrite ioctl was simplified, I did not spot that the phys path never took any locks and so we introduced this potential circular reference. Reported-by: Uwe Helm <uwe.helm@googlemail.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-08 09:19:11 +00:00
Chris Wilson	045e769ab6	drm/i915: Handle GPU hangs during fault gracefully. Instead of killing the process, just return no page found and reschedule the process giving the GPU some time to (hopefully) recover. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-07 09:18:22 +00:00
Daniel Vetter	75e9e9158f	drm/i915: kill mappable/fenceable distinction `a00b10c360` "Only enforce fence limits inside the GTT" also added a fenceable/mappable distinction when binding/pinning buffers. This only complicates the code with no practical gain: - In execbuffer this matters only for g33/pineview, as this is the only chip that needs fences and has an unmappable gtt area. But fences are only possible in the mappable part of the gtt, so need_fence implies need_mappable. And need_mappable is only set independently with relocations which implies (for sane userspace) that the buffer is untiled. - The overlay code is only really used on i8xx, which doesn't have unmappable gtt. And it doesn't support tiled buffers, currently. - For all other buffers it's a bug to pass in a tiled bo. In short, this distinction doesn't have any practical gain. I've also reverted mapping the overlay and context pages as possibly unmappable. It's not worth being overly clever here, all the big gains from unmappable are for execbuf bos. Also add a comment for a clever optimization that confused me while reading the original patch by Chris Wilson. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-04 19:02:03 +00:00
Chris Wilson	085ce26437	drm/i915: Ensure that if we ever try to pin+fence it is mappable. When merging Daniel's full-gtt patches I had a set of tweaks which I thought I had undone. I was half right... Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=31286 Reported-by: jinjin.wang@intel.com Reported-by: Alexey Fisher <bug-track@fisher-privat.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-03 09:31:57 +00:00
Chris Wilson	f2a630bfec	Merge branch 'drm-intel-fixes' into drm-intel-next Conflicts: drivers/gpu/drm/i915/i915_gem.c drivers/gpu/drm/i915/i915_gem_evict.c	2010-11-01 13:44:41 +00:00
Chris Wilson	c6afd65807	drm/i915: Apply big hammer to serialise buffer access between rings Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org	2010-11-01 13:39:24 +00:00
Chris Wilson	0f8c6d7ca9	drm/i915: Move the invalidate|flush information out of the device struct ... and into a local structure scoped for the single function in which it is used. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-01 12:38:44 +00:00
Chris Wilson	13b2928933	drm/i915: Apply big hammer to serialise buffer access between rings Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-11-01 12:31:19 +00:00
Chris Wilson	5eac3ab459	drm/i915: Evict just the purgeable GTT entries on the first pass Take two passes to evict everything whilst searching for sufficient free space to bind the batchbuffer. After searching for sufficient free space using LRU eviction, evict everything that is purgeable and try again. Only then if there is insufficient free space (or the GTT is too badly fragmented) evict everything from the aperture and try one last time. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-31 12:31:30 +00:00
Chris Wilson	ff75b9bc48	drm/i915: Fix typo from `e5281ccd` in i915_gem_attach_phys_object() Accessing the uninitialised obj->pages instead of the local page led to an OOPs. Reported-by: Xavier Chantry <chantry.xavier@gmail.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-30 22:52:31 +01:00
Chris Wilson	872d860c85	drm/i915: Remove the duplicate domain-change tracepoint for GPU flush Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-29 11:15:54 +01:00
Chris Wilson	a00b10c360	drm/i915: Only enforce fence limits inside the GTT. So long as we adhere to the fence registers rules for alignment and no overlaps (including with unfenced accesses to linear memory) and account for the tiled access in our size allocation, we do not have to allocate the full fenced region for the object. This allows us to fight the bloat tiling imposed on pre-i965 chipsets and frees up RAM for real use. [Inside the GTT we still suffer the additional alignment constraints, so it doesn't magically allow us to render larger scenes without stalls -- we need the expanded GTT and fence pipelining to overcome those...] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-29 11:15:07 +01:00
Chris Wilson	7465378fd7	drm/i915: Convert BUG_ON(pin_count) from an impossible condition Also spotted by Dan Carpenter. obj->pin_count is unsigned so the BUG_ON(obj->pin_count<0) will never trigger. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-29 10:54:29 +01:00
Chris Wilson	bbe2e11a4b	drm/i915: Do not return -1 from shrinker when nr_to_scan == 0 The error code is only expected during the actual pruning and not during the first measurement (nr_to_scan == 0) pass. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-28 22:35:07 +01:00
Chris Wilson	395b70be54	drm/i915: Flush read-only buffers from the active list upon idle as well It is possible for the active list to only contain a read-only buffer so that the ring->gpu_write_list remains empty. This leads to an inconsistency between i915_gpu_is_active() and i915_gpu_idle() causing an infinite spin during the shrinker and an assertion failure that i915_gpu_idle() does indeed flush all buffers from the active lists. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-28 21:31:19 +01:00
Chris Wilson	4a684a4117	drm/i915: Kill GTT mappings when moving from GTT domain In order to force a page-fault on a GTT mapping after we start using it from the GPU and so enforce correct CPU/GPU synchronisation, we need to invalidate the mapping. Pointed out by Owain G. Ainsworth. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-28 20:55:03 +01:00
Chris Wilson	e5281ccd2e	drm/i915: Eliminate nested get/put pages By using read_cache_page() for individual pages during pwrite/pread we can eliminate an unnecessary large allocation (and immediate free) of obj->pages. Also this eliminates any potential nesting of get/put pages, simplifying the code and preparing the path for greater things. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-28 20:55:02 +01:00
Chris Wilson	39a01d1fb6	drm/i915: Remove mmap_offset Since we rarely use the mmap_offset and it is easily computable from the obj->map_list.hash, remove it. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-28 20:55:02 +01:00
Chris Wilson	17250b7155	drm/i915: Make the inactive object shrinker per-device Eliminate the racy device unload by embedding a shrinker into each device. Smaller, simpler code. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-28 20:55:01 +01:00
Chris Wilson	da761a6edf	drm/i915: Bail early if we try to mmap an object too large to be mapped. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-27 23:31:08 +01:00
Daniel Vetter	fb7d516af1	drm/i915: add accounting for mappable objects in gtt v2 More precisely: For those that _need_ to be mappable. Also add two BUG_ONs in fault and pin to check the consistency of the mappable flag. Changes in v2: - Add tracking of gtt mappable space (to notice mappable/unmappable balancing issues). - Improve the mappable working set tracking by tracking fault and pin separately. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-27 23:31:08 +01:00
Daniel Vetter	ec57d2602a	drm/i915: add mappable to gem_object_bind tracepoint This way we can make some more educated guesses as to why exactly we can't use 2G apertures to their full potential ;) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-27 23:31:07 +01:00
Daniel Vetter	53984635a6	drm/i915: use the complete gtt At least the part that's currently enabled by the BIOS. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-27 23:31:06 +01:00
Daniel Vetter	16e809acc1	drm/i915: unbind unmappable objects on fault/pin In i915_gem_object_pin obviously unbind only if mappable is true. This is the last part to enable gtt_mappable_end != gtt_size, which the next patch will do. v2: Fences on g33/pineview only work in the mappable part of the gtt. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-27 23:31:05 +01:00
Daniel Vetter	920afa77ce	drm/i915: range-restricted bind_to_gtt Like before add a parameter mappable (also to gem_object_pin) and set it depending upon the context. Only bos that are brought into the gtt due to an execbuffer call can be put into the unmappable part of the gtt, everything else (especially pinned objects) need to be put into the mappable part of the gtt. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-27 23:31:05 +01:00
Daniel Vetter	a6e0aa4214	drm/i915: range-restricted eviction support Add a mappable parameter to i915_gem_evict_something to distinguish the two cases (non-restricted vs. mappable gtt allocations). No functional changes because the mappable limit is set to the end of the gtt currently. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-27 23:31:04 +01:00
Chris Wilson	3cce469cab	drm/i915: Propagate error from failing to queue a request Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-27 23:31:03 +01:00
Chris Wilson	b2223497b4	drm/i915: Remove the confusing global waiting/irq seqno Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-27 23:30:59 +01:00
Chris Wilson	7e318e18f2	drm/i915: Move object to GPU domains after dispatching execbuffer In the event that we fail to dispatch the execbuffer, for example if there is insufficient space on the ring, we were leaving the objects in an inconsistent state. Notably they were marked as being in the GPU write domain, but were not added to the ring or any list. This would lead to inevitable oops: [ 1010.522940] [drm:i915_gem_do_execbuffer] ERROR dispatch failed -16 [ 1010.523055] BUG: unable to handle kernel NULL pointer dereference at 0000000000000088 [ 1010.523097] IP: [<ffffffff8122d006>] i915_gem_flush_ring+0x26/0x140 [ 1010.523120] PGD 14cf2f067 PUD 14ce04067 PMD 0 [ 1010.523140] Oops: 0000 [#1] SMP [ 1010.523154] last sysfs file: /sys/devices/virtual/vc/vcsa2/uevent [ 1010.523173] CPU 0 [ 1010.523183] Pid: 716, comm: X Not tainted 2.6.36+ #34 LosLunas CRB/SandyBridge Platform [ 1010.523206] RIP: 0010:[<ffffffff8122d006>] [<ffffffff8122d006>] i915_gem_flush_ring+0x26/0x140 [ 1010.523233] RSP: 0018:ffff88014bf97cd8 EFLAGS: 00010296 [ 1010.523249] RAX: ffff88014e2d1808 RBX: 0000000000000000 RCX: 0000000000000000 [ 1010.523270] RDX: 0000000000000002 RSI: 0000000000000000 RDI: 0000000000000000 [ 1010.523290] RBP: ffff88014e2d1000 R08: 0000000000000002 R09: 00000000400c645f [ 1010.523311] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000002 [ 1010.523331] R13: ffff88014e29a000 R14: 00000000000000c8 R15: ffffffff8162eb28 [ 1010.523352] FS: 00007fc62379d700(0000) GS:ffff88001fc00000(0000) knlGS:0000000000000000 [ 1010.523375] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1010.523392] CR2: 0000000000000088 CR3: 000000014bf87000 CR4: 00000000000406f0 [ 1010.523412] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1010.523433] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 1010.523454] Process X (pid: 716, threadinfo ffff88014bf96000, task ffff88014cc1ee40) [ 1010.523475] Stack: [ 1010.523483] ffff88014d5199c0 
0000000000000200 0000000000000000 ffff88014bcc6400 [ 1010.523509] <0> 0000000000000000 0000000000000001 ffff88014e29a000 ffff88014bcc6400 [ 1010.523537] <0> ffffffff8162eb28 ffffffff8122faa8 ffff88014e29a000 ffff88014bcc6400 [ 1010.523568] Call Trace: [ 1010.523578] [<ffffffff8122faa8>] ? i915_gem_object_flush_gpu_write_domain+0x48/0x80 [ 1010.523601] [<ffffffff8122fb8e>] ? i915_gem_object_set_to_gtt_domain+0x2e/0xb0 [ 1010.523623] [<ffffffff8123113b>] ? i915_gem_set_domain_ioctl+0xdb/0x1f0 [ 1010.523644] [<ffffffff8120a3f1>] ? drm_ioctl+0x3d1/0x460 [ 1010.523660] [<ffffffff81231060>] ? i915_gem_set_domain_ioctl+0x0/0x1f0 [ 1010.523682] [<ffffffff81092618>] ? vma_prio_tree_insert+0x28/0x120 [ 1010.523701] [<ffffffff8109f379>] ? vma_link+0x99/0xf0 [ 1010.523717] [<ffffffff810a111d>] ? mmap_region+0x1ed/0x4f0 [ 1010.523734] [<ffffffff810c306f>] ? do_vfs_ioctl+0x9f/0x580 [ 1010.523750] [<ffffffff810c3599>] ? sys_ioctl+0x49/0x80 [ 1010.523767] [<ffffffff810022eb>] ? system_call_fastpath+0x16/0x1b [ 1010.523785] Code: 00 00 00 00 00 41 57 89 ce 41 56 41 55 41 54 45 89 c4 55 48 89 fd 53 48 89 d3 44 89 c2 48 89 df 4c 8d b3 c8 00 00 00 48 83 ec 18 <ff> 93 88 00 00 00 48 8b 83 c8 00 00 00 4c 8b bd 30 03 00 00 48 [ 1010.523946] RIP [<ffffffff8122d006>] i915_gem_flush_ring+0x26/0x140 [ 1010.523966] RSP <ffff88014bf97cd8> [ 1010.523977] CR2: 0000000000000088 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-27 23:26:34 +01:00
Chris Wilson	e1f99ce6ca	drm/i915: Propagate errors from writing to ringbuffer Preparing the ringbuffer for adding new commands can fail (a timeout whilst waiting for the GPU to catch up and free some space). So check for any potential error before overwriting HEAD with new commands, and propagate that error back to the user where possible. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-27 23:26:34 +01:00
Chris Wilson	78501eac34	drm/i915/ringbuffer: Drop the redundant dev from the vfunc interface The ringbuffer keeps a pointer to the parent device, so we can use that instead of passing around the pointer on the stack. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-27 12:18:21 +01:00
Linus Torvalds	c48c43e422	Merge branch 'drm-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (476 commits) vmwgfx: Implement a proper GMR eviction mechanism drm/radeon/kms: fix r6xx/7xx 1D tiling CS checker v2 drm/radeon/kms: properly compute group_size on 6xx/7xx drm/radeon/kms: fix 2D tile height alignment in the r600 CS checker drm/radeon/kms/evergreen: set the clear state to the blit state drm/radeon/kms: don't poll dac load detect. gpu: Add Intel GMA500(Poulsbo) Stub Driver drm/radeon/kms: MC vram map needs to be >= pci aperture size drm/radeon/kms: implement display watermark support for evergreen drm/radeon/kms/evergreen: add some additional safe regs v2 drm/radeon/r600: fix tiling issues in CS checker. drm/i915: Move gpu_write_list to per-ring drm/i915: Invalidate the to-ring, flush the old-ring when updating domains drm/i915/ringbuffer: Write the value passed in to the tail register agp/intel: Restore valid PTE bit for Sandybridge after `bdd3072` drm/i915: Fix flushing regression from `9af90d19f` drm/i915/sdvo: Remove unused encoding member i915: enable AVI infoframe for intel_hdmi.c [v4] drm/i915: Fix current fb blocking for page flip drm/i915: IS_IRONLAKE is synonymous with gen == 5 ... Fix up conflicts in - drivers/gpu/drm/i915/{i915_gem.c, i915/intel_overlay.c}: due to the new simplified stack-based kmap_atomic() interface - drivers/gpu/drm/vmwgfx/vmwgfx_drv.c: added .llseek entry due to BKL removal cleanups.	2010-10-26 18:57:59 -07:00
Peter Zijlstra	3e4d3af501	mm: stack based kmap_atomic() Keep the current interface but ignore the KM_type and use a stack based approach. The advantage is that we get rid of crappy code like: #define __KM_PTE \ (in_nmi() ? KM_NMI_PTE : \ in_irq() ? KM_IRQ_PTE : \ KM_PTE0) and in general can stop worrying about what context we're in and what kmap slots might be appropriate for that. The downside is that FRV kmap_atomic() gets more expensive. For now we use a CPP trick suggested by Andrew: #define kmap_atomic(page, args...) __kmap_atomic(page) to avoid having to touch all kmap_atomic() users in a single patch. [ not compiled on: - mn10300: the arch doesn't actually build with highmem to begin with ] [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: fix up drivers/gpu/drm/i915/intel_overlay.c] Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Chris Metcalf <cmetcalf@tilera.com> Cc: David Howells <dhowells@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: David Miller <davem@davemloft.net> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Dave Airlie <airlied@linux.ie> Cc: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-10-26 16:52:08 -07:00
Chris Wilson	641934069d	drm/i915: Move gpu_write_list to per-ring ... to prevent flush processing of an idle (or even absent) ring. This fixes a regression during suspend from `87acb0a5`. Reported-and-tested-by: Alexey Fisher <bug-track@fisher-privat.net> Tested-by: Peter Clifton <pcjc2@cam.ac.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-24 20:22:51 +01:00
Chris Wilson	b6651458d3	drm/i915: Invalidate the to-ring, flush the old-ring when updating domains When the object has been written to by the gpu it remains on the ring until its flush has been retired. However, when the object is moving to the ring and the associated cache needs to be invalidated, we need to perform the flush on the target ring, not the one it came from (which is NULL in the reported case and so the flush was entirely absent). Reported-by: Peter Clifton <pcjc2@cam.ac.uk> Reported-and-tested-by: Alexey Fisher <bug-track@fisher-privat.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-23 11:07:21 +01:00
Chris Wilson	878a3c37d3	drm/i915: Fix flushing regression from `9af90d19f` Whilst moving the code around in `9af90d19f`, I dropped the or'ing in of new write domains which would zero out the write domain for a render target if later reused as a source later in the batch. This meant that we might drop a required flush before reading from the render target. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=31043 Reported-by: xunx.fang@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-22 10:48:12 +01:00
Chris Wilson	549f736582	drm/i915: Enable SandyBridge blitter ring Based on an original patch by Zhenyu Wang, this initializes the BLT ring for SandyBridge and enables support for user execbuffers. Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-21 19:08:39 +01:00
Chris Wilson	b5dc608c98	drm/i915: Copy the updated reloc->presumed_offset back to the user If the userspace driver is using a constant relocation array with a static buffer, they will pass the same relocation array back to the kernel. So we do need to update the presumed offset value in those relocations to reflect the current object so that they remain correct with future batchbuffers and we avoid the necessity of having to suspend execution and perform redundant relocations. Fixes the regression introduced by `12f889c` for applications using absolute addressing on trees of buffer (i.e. the current consumers of libdrm_intel.so). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=30996 Reported-by: Wang, Jinjin <jinjin.wang@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-20 21:06:34 +01:00
Chris Wilson	69dc4987cb	drm/i915: Track objects in global active list (as well as per-ring) To handle retirements, we need per-ring tracking of active objects. To handle evictions, we need global tracking of active objects. As we enable more rings, rebuilding the global list from the individual per-ring lists quickly grows tiresome and overly complicated. Tracking the active objects in two lists is the lesser of two evils. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-20 10:51:51 +01:00
Chris Wilson	87acb0a550	drm/i915: Simplify most HAS_BSD() checks ... by always initialising the empty ringbuffer it is always then safe to check whether it is active. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-20 10:51:51 +01:00
Chris Wilson	9af90d19f8	drm/i915: cache the last object lookup during pin_and_relocate() The most frequent relocation within a batchbuffer is a contiguous sequence of vertex buffer relocations, for which we can virtually eliminate the drm_gem_object_lookup() overhead by caching the last handle to object translation. In doing so we refactor the pin and relocate retry loop out of do_execbuffer into its own helper function and so improve the error paths. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-20 10:51:50 +01:00
Chris Wilson	1d7cfea152	drm/i915: Do interruptible mutex lock first to avoid locking for unreference One of the primary consumers of the i915 driver is X, a large signal driven application. Frequently when writing into the buffers, there is a pending signal which causes us not to take the interruptible lock but then we need to take that same lock around the object unreference. By rearranging the code to do the interruptible lock as the first check, we can avoid the frequent additional locking around the unreference. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-19 09:20:23 +01:00
Chris Wilson	4f27b75d56	drm/i915: rearrange mutex acquisition for pread ... to avoid the double acquisition along fast[er] paths. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-19 09:19:55 +01:00
Chris Wilson	fbd5a26d50	drm/i915: Rearrange acquisition of mutex during pwrite ... to avoid reacquiring it to drop the object reference count on exit. Note we have to make sure we now drop (and reacquire) the lock around acquiring the mm semaphore on the slow paths. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-19 09:19:47 +01:00
Chris Wilson	b5e4feb661	drm/i915: Attempt to prefault user pages for pread/pwrite ... in the hope that it makes the atomic fast paths more likely. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-19 09:19:37 +01:00
Chris Wilson	202f2fef7a	drm/i915: Avoid taking the mutex for dropping the refcnt upon creation After allocating a handle for the fresh object, we know that we can safely drop the refcnt without triggering a free so we do not need the mutex. Strangely, this mutex acquisition is the one that appears on driver profiles. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-19 09:19:28 +01:00
Chris Wilson	f0c43d9b7e	drm/i915: Perform relocations in CPU domain [if in CPU domain] Avoid an early eviction of the batch buffer into the uncached GTT domain, and so do the relocation fixup in cacheable memory. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-19 09:19:18 +01:00
Chris Wilson	2549d6c26c	drm/i915: Avoid vmallocing a buffer for the relocations ... perform an access validation check up front instead and copy them in on-demand, during i915_gem_object_pin_and_relocate(). As around 20% of the CPU overhead may be spent inside vmalloc for the relocation entries when submitting an execbuffer [for x11perf -aa10text], the savings are considerable and result in around a 10% throughput increase [for glyphs]. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-19 09:18:36 +01:00
Chris Wilson	e59f2bac15	drm/i915: Wait for pending flips on the GPU Currently, if a batch buffer refers to an object with a pending flip, then we sleep until that pending flip is completed (unpinned and signalled). This is so that a flip can be queued and the user can continue rendering to the backbuffer oblivious to whether the buffer is still pinned as the scan out. (The kernel arbitrating at the last moment to stall the batch and wait until the buffer is unpinned and replaced as the front buffer.) As we only have a queue depth of 1, we can simply wait for the current pending flip to complete and continue rendering. We can achieve this with a single WAIT_FOR_EVENT command inserted into the ring buffer prior to executing the batch, without stalling the client. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-07 19:10:09 +01:00
Dave Airlie	fb7ba2114b	Merge remote branch 'korg/drm-fixes' into drm-vmware-next necessary for some of the vmware fixes to be pushed in. Conflicts: drivers/gpu/drm/drm_gem.c drivers/gpu/drm/i915/intel_fb.c include/drm/drmP.h	2010-10-06 11:10:48 +10:00
Linus Torvalds	c470af0a27	Merge branch 'drm-intel-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ickle/drm-intel * 'drm-intel-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ickle/drm-intel: drm/i915: Rephrase pwrite bounds checking to avoid any potential overflow drm/i915: Sanity check pread/pwrite drm/i915: Use pipe state to tell when pipe is off drm/i915: vblank status not valid while training display port drivers/gpu/drm/i915/i915_gem.c: Add missing error handling code drm/i915: Fix refleak during eviction. drm/i915: fix GMCH power reporting	2010-10-04 11:10:26 -07:00
Chris Wilson	35b62a89b0	drm/i915: Skip pread/pwrite if size to copy is 0. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-04 10:07:46 +01:00
Chris Wilson	df6d075a4d	Merge branch 'drm-intel-fixes' into drm-intel-next	2010-10-04 10:07:38 +01:00
Chris Wilson	7dcd2499de	drm/i915: Rephrase pwrite bounds checking to avoid any potential overflow ... and do the same for pread. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org	2010-10-03 14:16:18 +01:00
Chris Wilson	ce9d419dbe	drm/i915: Sanity check pread/pwrite Move the access control up from the fast paths, which are no longer universally taken first, up into the caller. This then duplicates some sanity checking along the slow paths, but is much simpler. Tracked as CVE-2010-2962. Reported-by: Kees Cook <kees@ubuntu.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org	2010-10-03 14:16:17 +01:00
Chris Wilson	58e10eb92d	Merge branch 'drm-intel-fixes' into drm-intel-next Conflicts: drivers/gpu/drm/i915/i915_gem_evict.c drivers/gpu/drm/i915/intel_display.c drivers/gpu/drm/i915/intel_dp.c	2010-10-03 10:56:11 +01:00
Julia Lawall	929f49bf22	drivers/gpu/drm/i915/i915_gem.c: Add missing error handling code Extend the error handling code with operations found in other nearby error handling code. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @r exists@ @r@ statement S1,S2,S3; constant C1,C2,C3; @@ if (...) {... S1 return -C1;} ... if (...) {... when != S1 return -C2;} ... *if (...) {... S1 return -C3;} // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org	2010-10-02 15:21:26 +01:00
Chris Wilson	1cdf7fef79	drm/i915: Don't mask the return code whilst relocating. The return from move_to_gtt_domain() may indicate a pending signal which needs to be handled as opposed to an actual error, for instance, so report the original return value rather than forcing an EINVAL. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-02 15:12:41 +01:00
Linus Torvalds	18ffe4b18c	Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: vmwgfx: Fix fb VRAM pinning failure due to fragmentation vmwgfx: Remove initialisation of dev::devname vmwgfx: Enable use of the vblank system vmwgfx: vt-switch (master drop) fixes drm/vmwgfx: Fix breakage introduced by commit "drm: block userspace under allocating buffer and having drivers overwrite it (v2)" drm: Hold the mutex when dropping the last GEM reference (v2) drm/gem: handlecount isn't really a kref so don't make it one. drm: i810/i830: fix locked ioctl variant drm/radeon/kms: add quirk for MSI K9A2GM motherboard drm/radeon/kms: fix potential segfault in r600_ioctl_wait_idle drm: Prune GEM vma entries drm/radeon/kms: fix up encoder info messages for DFP6 drm/radeon: fix PCI ID 5657 to be an RV410	2010-10-01 10:58:31 -07:00
Chris Wilson	069efc1dac	drm/i915: Clear fence registers on GPU reset When the GPU is reset, the fence registers are invalidated, so release the objects and clear them out. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-01 14:45:22 +01:00
Chris Wilson	812ed49243	drm/i915: Force the domain to CPU on unbinding whilst wedged. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=30083 Reported-by: Sitsofe Wheeler <sitsofe@yahoo.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-10-01 14:45:21 +01:00
Chris Wilson	73aa808f10	drm: Move the GTT accounting to i915 Only drm/i915 does the bookkeeping that makes the information useful, and the information maintained is driver specific, so move it out of the core and into its single user. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Dave Airlie <airlied@redhat.com>	2010-10-01 14:45:20 +01:00
Dave Airlie	29d08b3efd	drm/gem: handlecount isn't really a kref so don't make it one. There were lots of places being inconsistent since handle count looked like a kref but it really wasn't. Fix this by just making handle count an atomic on the object, and have it increase the normal object kref. Now i915/radeon/nouveau drivers can drop the normal reference on userspace object creation, and have the handle hold it. This patch fixes a memory leak or corruption on unload, because the driver had no way of knowing if a handle had been actually added for this object, and the fbcon object needed to know this to clean itself up properly. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Dave Airlie <airlied@redhat.com>	2010-10-01 09:17:44 +10:00
Chris Wilson	f394940b8d	drm/i915: Remove redundant deletion of obj->gpu_write_list At that point as the object is no longer in any GPU write domain it must not be on the list, so the list_del() is redundant. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-30 09:30:51 +01:00
Chris Wilson	5cdf588174	drm/i915: Make get/put pages static Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-30 09:30:13 +01:00
Chris Wilson	23bc598253	drm/i915/debug: Convert i915_verify_active() to scan all lists ... and check more regularly. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-30 09:30:11 +01:00
Chris Wilson	891b48cfc8	drm/i915: Avoid blocking the kworker thread on a stuck mutex Just reschedule the retire requests again if the device is currently busy. The request list will be pruned along other paths so will never grow unbounded and so we can afford to miss the occasional pruning. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-29 12:26:37 +01:00
Chris Wilson	3d2a812ae4	drm/i915/debug: Remove default WATCH_BUF Replaced by tracepoints. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-29 11:41:19 +01:00
Chris Wilson	97d1ebaf81	drm/i915/debug: Remove defunct WATCH_LRU This has bitrotted through disuse and been superseded by tracing and debugfs. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-29 11:41:18 +01:00
Chris Wilson	e0e41598b4	Merge branch 'drm-intel-fixes' into drm-intel-next	2010-09-28 15:48:38 +01:00
Chris Wilson	a56ba56c27	Revert "drm/i915: Drop ring->lazy_request" With multiple rings generating requests independently, the outstanding requests must also be tracked independently. Reported-by: Wang Jinjin <jinjin.wang@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=30380 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-28 11:30:52 +01:00
Chris Wilson	ced270fa89	drm/i915: Ensure that the mode change flushing is currently uninterruptible Introduced by `48b956c5`, I had thought I had already fixed this. Oh well. Reported-by: Sitsofe Wheeler <sitsofe@yahoo.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-26 22:50:36 +01:00
Chris Wilson	1c25595f8d	drm/i915: Convert the file mutex into a spinlock Daniel Vetter pointed out that in this case it would be clearer and cleaner to use a spinlock instead of a mutex to protect the per-file request list manipulation. Make it so. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-26 11:03:27 +01:00
Chris Wilson	76c1dec197	drm/i915: Make the mutex_lock interruptible on ioctl paths ... and combine it with the wedged completion handler. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-25 12:23:12 +01:00
Chris Wilson	30dbf0c07f	drm/i915: Adjust hangcheck EIO semantics Owain Ainsworth reported an issue between the interaction of the hangcheck and userspace immediately (and permanently) falling back to s/w rasterisation. In order to break the mutex and begin resetting the GPU, we must abort the current operation (usually within the wait) and climb sufficiently far back up the call chain to drop the mutex. In his implementation, Owain has a loop within the ioctl handler to detect the hang and then sleep until the error handler has run. I've chosen to return to userspace and report an EAGAIN which should trigger the userspace ioctl handler to repeat the call (simply because it felt less invasive...). Before hitting a wedged GPU, we then wait upon completion of the error handler. Reported-by: Owain G. Ainsworth <zerooa@googlemail.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-25 12:23:12 +01:00
Chris Wilson	f787a5f59e	drm/i915: Only hold a process-local lock whilst throttling. Avoid causing latencies in other clients by not taking the global struct mutex and moving the per-client request manipulation to a local per-client mutex. For example, this allows a compositor to schedule a page-flip (through X) whilst an OpenGL application is monopolising the GPU. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-24 21:03:00 +01:00
Chris Wilson	e6c3a2a6d3	drm/i915: Use an uninterruptible wait for page-flips during modeset We need to drain the pending flips prior to disabling the pipe during modeset, and these need to be done in an uninterruptible fashion. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-24 14:19:57 +01:00
Chris Wilson	20f0cd55f6	drm/i915: Remove the broken flush_ring from page-flip This is already performed with the pipelined flush, so by the time we schedule the flush in the page-flip, the ring is NULL and we OOPs instead. Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-23 11:02:55 +01:00
Chris Wilson	9b74f7348f	drm/i915: Fix 945GM regression in `e259befd` A minor typo caused a single fence register to be incorrectly programmed, resulting in occasional tiling corruption. Reported-and-tested-by: Hans de Bruin <bruinjm@xs4all.nl> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=18962 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org	2010-09-23 10:30:57 +01:00
Chris Wilson	5c12a07e80	drm/i915: Drop ring->lazy_request We are not currently using it as intended, so remove the complication. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-22 11:58:55 +01:00
Chris Wilson	dfaae392f4	drm/i915: Clear the gpu_write_list on resetting write_domain upon hang Otherwise we will hit a list handling assertion when moving the object to the inactive list. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-22 10:31:52 +01:00
Chris Wilson	9e0ae53404	drm/i915: Don't overwrite the returned error-code During i915_gem_create_mmap_offset() if the subsystem reports an error code, use it. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-21 15:05:24 +01:00
Chris Wilson	f13d3f7311	drm/i915: Track pinned objects Keep a list of pinned objects and display it via debugfs. Now all objects that exist in the GTT are always tracked on one of the active, flushing, inactive or pinned lists. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-21 11:24:17 +01:00
Chris Wilson	265db9585e	drm/i915: Drain any pending flips on the fb prior to unpinning If we have queued a page flip on the current fb and then request a mode change, wait until the page flip completes before performing the new request. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-21 11:24:17 +01:00
Chris Wilson	c78ec30bba	drm/i915: Merge ring flushing and lazy requests Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-21 11:24:16 +01:00
Chris Wilson	53640e1d07	drm/i915: Track gpu fence usage Track if the gpu requires the fence for the execution of a batch buffer and so only wait upon the retirement of the object's last rendering seqno if the fence is in use by the GPU. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-21 11:20:54 +01:00
Chris Wilson	c7f9f9a8b8	drm/i915: Use ring->flush() instead of MI_FLUSH Use the ring abstraction to hide the details of having to choose the appropriate flushing method. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-21 11:19:59 +01:00
Xiang, Haihao	5c1143bbec	drm/i915: do not export the instances of struct intel_ring_buffer Introduce intel_init_render_ring_buffer(), intel_init_bsd_ring_buffer for ring initialization. Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-21 11:19:55 +01:00
Chris Wilson	77f0123022	drm/i915: Clear GPU read domains on reset Clear the GPU read domain for the inactive objects on a reset so that they are correctly invalidated on reuse. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-21 11:19:53 +01:00
Chris Wilson	9375e446e7	drm/i915: Clear flushing lists on GPU reset Owain Ainsworth noticed that the reset code failed to clear the flushing list leaving the driver in an inconsistent state following a hung GPU. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-21 11:19:52 +01:00
Chris Wilson	9220434a87	drm/i915: Only emit a flush request on the active ring. When flushing the GPU domains, we emit a flush on both rings, even though they share a unified cache. Only emit the flush on the currently active ring. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-21 11:19:51 +01:00
Chris Wilson	b84d5f0c22	drm/i915: Inline i915_gem_ring_retire_request() Change the semantics to retire any buffer older than the current seqno rather than repeatedly calling the function to retire the buffer at the head of the list matching the request seqno. Whilst this should have no semantic impact on the implementation, Daniel was wondering if there was a bug where we might miss a retirement and so end up with a continually growing active list. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-21 11:19:50 +01:00
Chris Wilson	a6c45cf013	drm/i915: INTEL_INFO->gen supercedes i8xx, i9xx, i965g Avoid confusion between i965g meaning broadwater and the gen4+ chipset families. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-21 11:19:45 +01:00
Chris Wilson	e9e5f8e8d3	Merge branch 'drm-intel-fixes' into HEAD Conflicts: drivers/char/agp/intel-agp.c drivers/gpu/drm/i915/intel_crt.c	2010-09-21 11:19:32 +01:00
Chris Wilson	e259befd90	drm/i915: Fix Sandybridge fence registers With 5 places to update when adding handling for fence registers, it is easy to overlook one or two. Correct that oversight, but fence management should be improved before a new set of registers is added. Bugzilla: https://bugs.freedesktop.org/show_bug?id=30199 Original patch by: Yuanhan Liu <yuanhan.liu@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org	2010-09-17 08:18:30 +01:00
Chris Wilson	2b6efaa476	drm/i915: Remove unused intel_ringbuffer->ring_flag This can always be re-added should somebody find a use... Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-14 21:13:00 +01:00
Chris Wilson	2cf34d7b7e	drm/i915: Allow get_fence_reg() to be uninterruptible As we currently may need to acquire a fence register during a modeset, we need to be able to do so in an uninterruptible manner. So expose that parameter to the callers of the fence management code. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-14 21:08:36 +01:00
Chris Wilson	48b956c5a8	drm/i915: Push pipelining of display plane flushes to the caller This ensures that we do wait upon the flushes to complete if necessary and avoid the visual tears, whilst enabling pipelined page-flips. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-14 21:08:35 +01:00
Chris Wilson	0bc23aad3b	drm/i915: Fix regression in `ba3d8d749b` I pulled the wrong version of the patch from Daniel Vetter which was missing the read barriers -- and the one that was causing all the trouble was from i915_gem_object_put_fence_reg(), leading to GPU hangs on gen3. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-14 10:35:43 +01:00
Chris Wilson	7213342db5	drm/i915: Consolidate flushing the display plane Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-14 10:34:27 +01:00
Chris Wilson	b3b079dbef	drm/i915: Reduce hangcheck frequency By reducing the hangcheck frequency we check less often, conserving resources, and still detect a lock up quickly. On a fast machine with a slow GPU (like a Core2 paired with a 945G) it is easy for the hangcheck to misfire as we check too fast. Also once hung and if we fail to completely reset the chip, we have a nasty habit of proclaiming a hang many times a second and generating a strobe-like display. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-14 10:30:10 +01:00
Chris Wilson	995b6762f0	drm/i915: Quieten sparse warnings for missing prototypes. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-08 10:23:57 +01:00
Chris Wilson	de227ef090	drm/i915: Kill the active list spinlock This spinlock only served debugging purposes in a time when we could not be sure of the mutex ever being released upon a GPU hang. As we now should be able to rely on hangcheck to do the job for us (and that error reporting should not itself require the struct mutex) we can kill the incomplete attempt at protection. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-08 10:23:56 +01:00
Chris Wilson	8dc5d14741	drm/i915: Preallocate requests By allocating the request prior to writing to the ringbuffer, we can abort the operation without leaving the GPU in an inconsistent state. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2010-09-08 10:23:50 +01:00
Daniel Vetter	4fc6ee7646	drm/i915: drop i915_add_request right in front of i915_wait_request ... take advantage of the new implicit request issuing of i915_wait_request. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-08 10:23:39 +01:00
Daniel Vetter	ba3d8d749b	drm/i915: move the wait_rendering call into flush_gpu_write_domain One caller (for the pageflip support) wants a purely pipelined flush. Distinguish this case by a new parameter. This will also be useful later on for pipelined fencing. v2: Simplify the code by depending upon the implicit request emitting of i915_wait_request. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> [ickle: And drop the non-interruptible support in the process.] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-08 10:23:38 +01:00
Daniel Vetter	617dbe2787	drm/i915: drop seqno argument from i915_gem_object_move_to_active By moving one i915_add_request we can solely depend on the new auto-seqno-numbering behaviour. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-08 10:23:37 +01:00
Daniel Vetter	86394c669a	drm/i915: kill a no longer necessary BUG_ON i915_gem_object_move_to_active can handle zero seqno for us now. And not emitting a request is not fatal here - we'll try to emit a new one if we have to wait for some rendering to complete. In case this assumption ever gets accidentally broken, there's already a BUG_ON to catch it in i915_do_wait_request. So just silently ignore ENOMEM here instead of screwing up the whole drm. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-08 10:23:37 +01:00
Daniel Vetter	8a1a49f954	drm/i915: move flushing list processing to i915_retire_commands ... instead of threading flush_domains through the execbuf code to i915_add_request. With this change 2 small cleanups are possible (likewise the majority of the patch): - The flush_domains parameter of i915_add_request is always 0. Drop it and the corresponding logic. - Ditto for the seqno param of i915_gem_process_flushing_list. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-08 10:23:36 +01:00
Daniel Vetter	a6910434e1	drm/i915: only one interrupt per batchbuffer is not enough! Previously I thought that one interrupt per batchbuffer should be enough. Now tedious benchmarking showed this to be wrong. Therefore track whether any commands have been issued with a future seqno (like pipelined fencing changes or flushes). If this is the case emit a request before issuing the batchbuffer. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-08 10:23:35 +01:00
Daniel Vetter	8bff917c93	drm/i915: move flushing list processing to i915_gem_flush Now that we can move objects to the active list without already having emitted a request, move the flushing list handling into i915_gem_flush. This makes more sense and allows to drop a few i915_add_request calls that are not strictly necessary. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-08 10:23:35 +01:00
Daniel Vetter	e35a41de39	drm/i915: allow lazy emitting of requests Sometimes (like when flushing in preparation of batchbuffer execution) we know that we'll emit a request but haven't yet done so. Allow this case by simply taking the next seqno by default. Ensure that a request is eventually emitted before waiting for a request by issuing it in i915_wait_request iff this is not yet done. Also replace one open-coded version of i915_gem_object_wait_rendering, to prevent future code-diversion. Chris Wilson asked me to explain and clarify what this patch does and why. Here it goes: Old way of moving objects onto the active list and associating them with a request: 1. i915_add_request + store the returned seqno somewhere 2. i915_gem_object_move_to_active (with the stored seqno as parameter) For the current users, this is all fine. But I'd like to associate objects (and fence regs) with the batchbuffer request deep down in the execbuf call-chain. I thought about three ways of implementing this. a) Don't care, just emit request when we need a new seqno. When heavily pipelining fence reg changes, this would have caused tons of superfluous requests (and corresponding irqs). b) Thread all changed fences, objects, whatever through the execbuf-maze, so that when we emit a request, we can store the new seqno at all the right places. c) Kill that seqno-threading-around business by simply storing the next seqno, i.e. allow 2. to be done before 1. in the above sequence. I've decided to implement c) (in this patch). The following patches are just fall-out that resulted from this small conceptual change. * We can handle the flushing list processing where we actually emit a flush (i915_gem_flush and i915_retire_commands) instead of in i915_add_request. The code makes IMHO more sense this way (and i915_add_request loses the flush_domains parameter, obviously). * We can avoid emitting unnecessary requests. 
IMHO there's no point in emitting more than one request per batchbuffer (with or without a corresponding irq). * By enforcing 2. before 1. ordering in the above sequence the seqno argument of i915_gem_object_move_to_active is redundant and can be dropped. v2: Now i915_wait_request issues request if it is not yet emitted. Also introduce i915_gem_next_request_seqno(dev) just in case we ever need to do some prep work before using a new seqno. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> [ickle: Keep i915_gem_object_set_to_display_plane() uninterruptible.] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-08 10:23:34 +01:00
Daniel Vetter	75ef9da2cd	drm/i915: unload: fix retire_work races ums-gem code correctly cancels the retire work (at lastclose time), kms does not do so. Fix this by canceling the work right after idling the gpu. While staring at the code I noticed that the work function is not static. Fix this, too. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-08 10:13:28 +01:00
Daniel Vetter	bc0c7f1443	drm/i915: unload: fix error_work races This is the first patch to clean up module unload races due to outstanding timers/work. Preparatory step: Thou shalt not destroy the workqueue when new work might still get enqueued. Now error_work gets queued by the hangcheck timer and only (atomically) reads the chip wedged status. So cancel it right after the hangcheck timer is killed. But the hangcheck is armed by interrupts, so move everything after irqs are disabled. Also change a del_timer to a del_timer_sync in the ums gem code, the hangcheck timer is self-rearming. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-08 10:13:24 +01:00
Zhenyu Wang	f8f235e5bb	agp/intel: Fix cache control for Sandybridge Sandybridge GTT has new cache control bits in PTE, which controls graphics page cache in LLC or LLC/MLC, so we need to extend the mask function to respect the new bits. And set cache control to always LLC only by default on Gen6. Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: stable@kernel.org Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2010-09-07 11:16:43 +01:00
Dan Carpenter	c877cdce93	i915: return -EFAULT if copy_to_user fails copy_to_user() returns the number of bytes remaining to be copied and I'm pretty sure we want to return a negative error code here. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org	2010-09-06 23:09:54 +01:00
Chris Wilson	1dfd9754cd	Revert "drm/i915: Unreference object not handle on creation" This reverts commit `86f100b136`. The kref API requires the handlecount to be initialised to one on object creation (so that kref_get() doesn't complain upon first use) so the dalliance in the drivers is required in order to sink the initial floating reference. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org	2010-09-06 23:09:49 +01:00
Linus Torvalds	4238a417a9	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel: (58 commits) drm/i915,intel_agp: Add support for Sandybridge D0 drm/i915: fix render pipe control notify on sandybridge agp/intel: set 40-bit dma mask on Sandybridge drm/i915: Remove the conflicting BUG_ON() drm/i915/suspend: s/IS_IRONLAKE/HAS_PCH_SPLIT/ drm/i915/suspend: Flush register writes before busy-waiting. i915: disable DAC on Ironlake also when doing CRT load detection. drm/i915: wait for actual vblank, not just 20ms drm/i915: make sure eDP PLL is enabled at the right time drm/i915: fix VGA plane disable for Ironlake+ drm/i915: eDP mode set sequence corrections drm/i915: add panel reset workaround drm/i915: Enable RC6 on Ironlake. drm/i915/sdvo: Only set is_lvds if we have a valid fixed mode. drm/i915: Set up a render context on Ironlake drm/i915 invalidate indirect state pointers at end of ring exec drm/i915: Wake-up wait_request() from elapsed hang-check (v2) drm/i915: Apply i830 errata for cursor alignment drm/i915: Only update i845/i865 CURBASE when disabled (v2) drm/i915: FBC is updated within set_base() so remove second call in mode_set() ...	2010-08-22 11:03:27 -07:00
Chris Wilson	156dadc180	drm/i915: Remove the conflicting BUG_ON() We now attempt to free "active" objects following a GPU hang as either the GPU will be reset or the hang is permanent. In either case, the GPU writes will not be flushed to main memory and it should be safe to return that memory back to the system. The BUG_ON(active) is thus overkill and can erroneously fire after an EIO. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-08-21 23:21:13 -07:00
Chris Wilson	bf79cb914d	drm: Use ENOENT consistently for the error return for an unmatched handle. This is consistent with trying to access a filename that does not exist within a directory, which is a good analogy here. The main reason for the change is that it is easy to confuse the error code of EBADF as performing an ioctl on an invalid file descriptor (rather than an unknown object). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Dave Airlie <airlied@redhat.com>	2010-08-10 10:46:55 +10:00
Chris Wilson	6eeefaf3c8	drm/i915: Apply i830 errata for cursor alignment i830 requires 32bpp cursors to be aligned to 16KB, so we have to expose the alignment parameter to i915_gem_attach_phys_object(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-08-09 11:24:36 -07:00
Chris Wilson	ae9fed6b60	drm/i915: Truncate the shmem backing pages on purge shmfs doesn't actually implement i_ops->truncate() so we were not immediately releasing the backing pages when shrinking the gfx cache under OOM. Instead use a combination of truncate_inode_pages() and i_ops->truncate_range() as is used by shmem_delete_inode(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-08-09 11:24:34 -07:00
Chris Wilson	7d1c4804ae	drm/i915: Maintain LRU order of inactive objects upon access by CPU (v2) In order to reduce the penalty of fallbacks under memory pressure and to avoid a potential immediate ping-pong of evicting a mmaped buffer, we move the object to the tail of the inactive list when a page is freshly faulted or the object is moved into the CPU domain. We choose not to protect the CPU objects from casual eviction, preferring to keep the GPU active for as long as possible. v2: Daniel Vetter found a bug where I forgot that pinned objects are kept off the inactive list. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-08-09 11:24:33 -07:00
Chris Wilson	b47eb4a2b3	drm/i915: Move the eviction logic to its own file. The eviction code is the gnarly underbelly of memory management, and is clearer if kept separated from the normal domain management in GEM. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-08-09 11:24:32 -07:00
Chris Wilson	6f392d5486	drm/i915: Use a common seqno for all rings. This will be used by the eviction logic to maintain fairness between the rings. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-08-09 11:24:32 -07:00
Daniel Vetter	0108a3edd5	drm/i915: prepare for fair lru eviction This does two little changes: - Add an alignment parameter for evict_something. It's not really great to whack a carefully sized hole into the gtt with the wrong alignment. Especially since the fallback path is a full evict. - With the inactive scan stuff we need to evict more than one object, so move the unbind call into the helper function that scans for the object to be evicted, too. And adjust its name. No functional changes in this patch, just preparation. Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-08-09 11:24:32 -07:00
Chris Wilson	bf1a109239	drm/i915: Append the object onto the inactive list on binding. In order to properly track bound objects, they need to exist on one of the inactive/active lists or be pinned. As this is a requirement, do the work inside i915_gem_bind_to_gtt() rather than dotted around the callsites. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-08-09 11:24:32 -07:00
Chris Wilson	ae7d49d879	drm/i915: Emit a backtrace if we attempt to rebind a pinned buffer This debugging trace was useful for finding the fbcon regression on i965, and it may prove useful again in future. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-08-09 11:24:30 -07:00
Chris Wilson	0be555b66a	drm/i915: report all active objects as busy Incorporates a similar patch by Daniel Vetter, the alteration being to report the current busy state after retiring. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-08-09 11:24:30 -07:00
Chris Wilson	88f356b725	drm/i915: Only emit flushes on active rings. This avoids the excess flush and requests on idle rings (and spamming the debug log ;-) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-08-09 11:24:29 -07:00
Chris Wilson	fca3ec01e0	drm,io-mapping: Specify slot to use for atomic mappings This is required should we ever attempt to use an io-mapping where KM_USER0 is verboten, such as inside an IRQ context. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2010-08-05 08:48:53 +10:00
Chris Wilson	86f100b136	drm/i915: Unreference object not handle on creation When creating an object, we create the handle by which it is known to the process and which owns the reference to the object. That reference to the new handle is what we want to transfer to the process, not the lost reference to the object; so free the local object reference not the process's handle reference. This brings i915_gem_object_create_ioctl() into line with drm_gem_open_ioctl() Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-08-01 19:58:06 -07:00
Chris Wilson	8dc1775dce	drm/i915: Attempt to uncouple object after catastrophic failure in unbind If we fail to flush outstanding GPU writes but return the memory to the system, we risk corrupting memory should the GPU recover and complete those writes. On the other hand, if we bail early and free the object then we have a definite use-after-free and real memory corruption. Choose the lesser of two evils, since in order to recover from the hung GPU we need to completely reset it, those pending writes should never happen. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-08-01 19:56:29 -07:00
Chris Wilson	be72615bcf	drm/i915: Repeat unbinding during free if interrupted (v6) If during the freeing of an object the unbind is interrupted by a system call, which is quite possible if we have outstanding GPU writes that must be flushed, the unbind is silently aborted. This still leaves the AGP region and backing pages allocated, and perhaps more importantly, the object remains upon the various lists exposing us to memory corruption. I think this is the cause behind the use-after-free, such as Bug 15664 - Graphics hang and kernel backtrace when starting Azureus with Compiz enabled https://bugzilla.kernel.org/show_bug.cgi?id=15664 v2: Daniel Vetter reminded me that kernel space programming is never easy. We cannot simply spin to clear the pending signal and so must defer the freeing of the object until later. v3: Run from the top level retire requests. v4: Tested with P(return -ERESTARTSYS)=.5 from i915_gem_do_wait_request() v5: Rebase against Eric's for-linus tree. v6: Refactor, split and add a comment about avoiding unbounded recursion. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel@ffwll.ch> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-08-01 19:53:24 -07:00
Chris Wilson	b09a1feca6	drm/i915: Refactor i915_gem_retire_requests() Combine the iteration over active render rings into a common function. This is in preparation for reusing the idle function to also retire deferred free requests. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-08-01 19:52:57 -07:00
Chris Wilson	2dafb1e082	drm/i915: Propagate error from i915_gem_object_flush_gpu_write_domain() Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-08-01 19:03:44 -07:00
Chris Wilson	5f35308bab	drm/i915: Propagate error from drm_install_irq() during EnterVT Simple fix for error propagation along the old UMS path. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-08-01 19:03:44 -07:00
Chris Wilson	43b27f40eb	drm/i915: Explosion following OOM in do_execbuffer. Oops, when merging the extra details following an OOM, I missed that driver_private is now NULL and the correct way to convert from the drm_gem_object into the drm_i915_gem_object is to use to_intel_bo(). BUG: unable to handle kernel NULL pointer dereference at 00000069 IP: [<c11a4a02>] i915_gem_do_execbuffer+0x71f/0xbb6 *pde = 00000000 Oops: 0000 [#1] SMP last sysfs file: /sys/devices/virtual/vc/vcsa3/uevent Pid: 10993, comm: X Not tainted 2.6.35-rc2+ #67 / EIP: 0060:[<c11a4a02>] EFLAGS: 00213202 CPU: 0 EIP is at i915_gem_do_execbuffer+0x71f/0xbb6 EAX: f647e8a8 EBX: 00000000 ECX: 00000003 EDX: 00000000 ESI: 00424000 EDI: 00000000 EBP: f6508e48 ESP: f6508dd4 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 Process X (pid: 10993, ti=f6508000 task=f6432880 task.ti=f6508000) Stack: f6508de0 f7130000 00000001 00000000 00000000 f647e8a8 00000000 f64f8480 <0> f7974414 00000000 00000006 00000000 00000000 f6578000 00000008 00000006 <0> f6797880 00400000 00000000 ffffffe4 f7974400 000000d0 000000d0 000001c0 Call Trace: [<c11a4f3a>] ? i915_gem_execbuffer2+0xa1/0xe7 [<c118ab96>] ? drm_ioctl+0x22c/0x2fa [<c11a4e99>] ? i915_gem_execbuffer2+0x0/0xe7 [<c107e88c>] ? do_sync_read+0x8f/0xca [<c1088cbd>] ? vfs_ioctl+0x2c/0x96 [<c118a96a>] ? drm_ioctl+0x0/0x2fa [<c10891f4>] ? do_vfs_ioctl+0x429/0x45a [<c107e5c9>] ? fsnotify_access+0x54/0x5f [<c107ee1c>] ? vfs_read+0x9a/0xae [<c1089258>] ? sys_ioctl+0x33/0x4d [<c1002610>] ? sysenter_do_call+0x12/0x26 Code: d0 89 4d c4 31 c9 89 45 d8 eb 44 8b 45 cc 8b 14 88 8b 42 50 89 45 bc 8b 45 a0 8b 52 38 89 55 d0 31 d2 f6 40 20 01 74 0d 8b 55 bc <f6> 42 69 30 0f 95 c2 0f b6 d2 8b 45 d0 c7 45 d4 00 00 00 00 89 EIP: [<c11a4a02>] i915_gem_do_execbuffer+0x71f/0xbb6 SS:ESP 0068:f6508dd4 CR2: 0000000000000069 ---[ end trace 3f1d514b34d39381 ]--- Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-08-01 19:03:42 -07:00
Dave Airlie	d656ae53f6	Merge tag 'v2.6.35-rc6' into drm-radeon-next Need this to avoid conflicts with future radeon fixes	2010-08-02 10:05:24 +10:00
Linus Torvalds	f4b23cc2d5	Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: drm/r600: fix possible NULL pointer derefernce drm/radeon/kms: add quirk for ASUS HD 3600 board include/linux/vgaarb.h: add missing part of include guard drm/nouveau: Fix crashes during fbcon init on single head cards. drm/nouveau: fix pcirom vbios shadow breakage from acpi rom patch drm/radeon/kms: fix shared ddc harder drm/i915: enable low power render writes on GEN3 hardware. drm/i915: Define MI_ARB_STATE bits vmwgfx: return -EFAULT if copy_to_user fails fb: handle allocation failure in alloc_apertures() drm: radeon: check kzalloc() result drm/ttm: Fix build on architectures without AGP drm/radeon/kms: fix gtt MC base alignment on rs4xx/rs690/rs740 asics drm/radeon/kms: fix possible mis-detection of sideport on rs690/rs740 drm/radeon/kms: fix legacy tv-out pal mode	2010-07-20 18:29:25 -07:00
Dave Airlie	944001201c	drm/i915: enable low power render writes on GEN3 hardware. A lot of 945GMs have had stability issues for a long time, this manifested as X hangs, blitter engine hangs, and lots of crashes. one such report is at: https://bugs.freedesktop.org/show_bug.cgi?id=20560 along with numerous distro bugzillas. This only took a week of digging and hair ripping to figure out. Tracked down and tested on a 945GM Lenovo T60, previously running x11perf -copypixwin500 or x11perf -copywinpix500 repeatedly would cause the GPU to wedge within 4 or 5 tries, with random busy bits set. After this patch no hangs were observed. cc: stable@kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com>	2010-07-20 15:24:18 +10:00
Dave Chinner	7f8275d0d6	mm: add context argument to shrinker callback The current shrinker implementation requires the registered callback to have global state to work from. This makes it difficult to shrink caches that are not global (e.g. per-filesystem caches). Pass the shrinker structure to the callback so that users can embed the shrinker structure in the context the shrinker needs to operate on and get back to it in the callback via container_of(). Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de>	2010-07-19 14:56:17 +10:00
Linus Torvalds	cd9f040df6	drm/i915: add 'reclaimable' to i915 self-reclaimable page allocations The hibernate issues that got fixed in commit `985b823b91` ("drm/i915: fix hibernation since i915 self-reclaim fixes") turn out to have been incomplete. Vefa Bicakci tested lots of hibernate cycles, and without the __GFP_RECLAIMABLE flag the system eventually fails to resume. With the flag added, Vefa can apparently hibernate forever (or until he gets bored running his automated scripts, whichever comes first). The reclaimable flag was there originally, and was one of the flags that were dropped (unintentionally) by commit `4bdadb9785` ("drm/i915: Selectively enable self-reclaim") that introduced all these problems, but I didn't want to just blindly add back all the flags in commit `985b823b91`, and it looked like __GFP_RECLAIM wasn't necessary. It clearly was. I still suspect that there is some subtle reason we're missing that causes the problems, but __GFP_RECLAIMABLE is certainly not wrong to use in this context, and is what the code historically used. And we have no idea what causes the corruption without it. Reported-and-tested-by: M. Vefa Bicakci <bicave@superonline.com> Cc: Dave Airlie <airlied@gmail.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-07-18 09:44:37 -07:00
Daniel Vetter	db3307a9f7	drm: kill drm_mm_node->private Only ever assigned, never used. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> [glisse: I will re-add if needed for range-restricted allocations] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Dave Airlie <airlied@redhat.com>	2010-07-07 12:26:44 +10:00
Linus Torvalds	985b823b91	drm/i915: fix hibernation since i915 self-reclaim fixes Since commit `4bdadb9785` ("drm/i915: Selectively enable self-reclaim"), we've been passing GFP_MOVABLE to the i915 page allocator where we weren't before due to some over-eager removal of the page mapping gfp_flags games the code used to play. This caused hibernate on Intel hardware to result in a lot of memory corruptions on resume. See for example http://bugzilla.kernel.org/show_bug.cgi?id=13811 Reported-by: Evengi Golov (in bugzilla) Signed-off-by: Dave Airlie <airlied@redhat.com> Tested-by: M. Vefa Bicakci <bicave@superonline.com> Cc: stable@kernel.org Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-07-01 18:37:01 -07:00
Chris Wilson	ab34c22681	drm/i915: Fix up address spaces in slow_kernel_write() Since we now get_user_pages() outside of the mutex prior to performing the copy, we kmap() the page inside the copy routine and so need to perform an ordinary memcpy() and not copy_from_user(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-05-28 11:03:29 -07:00
Chris Wilson	99a03df57c	drm/i915: Use non-atomic kmap for slow copy paths As we do not have a requirement to be atomic and avoid sleeping whilst performing the slow copy for shmem based pread and pwrite, we can use kmap instead, thus simplifying the code. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-05-28 11:02:36 -07:00
Chris Wilson	9b8c4a0b21	drm/i915: Avoid moving from CPU domain during pwrite We can avoid an early clflush when pwriting if we use the current CPU write domain rather than moving the object to the GTT domain for the purposes of the pwrite. This has the advantage of not flushing the presumably hot data that we want to upload into the bo, and of ascribing the clflush to the execution when profiling. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-05-28 11:02:00 -07:00
Chris Wilson	68f95ba9e2	drm/i915: Cleanup after failed initialization of ringbuffers The callers expect us to cleanup any partially initialised structures before reporting the error. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-05-28 11:01:02 -07:00
Chris Wilson	654fc6073f	drm/i915: Reject bind_to_gtt() early if object > aperture If the object is bigger than the entire aperture, reject it early before evicting everything in a vain attempt to find space. v2: Use E2BIG as suggested by Owain G. Ainsworth. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org Signed-off-by: Eric Anholt <eric@anholt.net>	2010-05-28 10:52:15 -07:00
Chris Wilson	3d1cc47037	drm/i915: Remove spurious warning "Failure to install fence" This particular warning is harmless as we emit it during the normal pinning process where the batch buffer requires more fences than are available without eviction. Only if we fail to evict enough fences does this become a problem, so include the requested number of fences in the ultimate error message. v2: Remember to compile test even trial patches to remove warnings. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-05-28 10:49:38 -07:00
Chris Wilson	ac0c6b5ad3	drm/i915: Rebind bo if currently bound with incorrect alignment. Whilst pinning the buffer, check that its current alignment matches the requested alignment. If it does not, rebind. This should clear up any final render errors whilst resuming, for reference: Bug 27070 - [i915] Page table errors with empty ringbuffer https://bugs.freedesktop.org/show_bug.cgi?id=27070 Bug 15502 - render error detected, EIR: 0x00000010 https://bugzilla.kernel.org/show_bug.cgi?id=15502 Bug 13844 - i915 error: "render error detected" https://bugzilla.kernel.org/show_bug.cgi?id=13844 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org Signed-off-by: Eric Anholt <eric@anholt.net>	2010-05-28 10:43:38 -07:00
Chris Wilson	808b24d6ed	drm/i915: Propagate error from unbinding an unfenceable object. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-05-28 10:42:52 -07:00
Chris Wilson	b118c1e363	drm/i915: Avoid nesting of domain changes when setting display plane Nesting domain changes will cause confusion when trying to interpret the tracepoints describing the sequence of changes for the object, as well as obscuring the order of operations for the reader of the code. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-05-28 10:42:04 -07:00
Daniel Vetter	778c35444f	drm/i915: combine all small integers into one single bitfield This saves a whopping 7 dwords. Zero functional changes. Because some of the refcounts are rather tightly calculated, I've put BUG_ONs in the code to check for overflows. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-05-26 14:13:36 -07:00
Zou Nan hai	d1b851fc0d	drm/i915: implement BSD ring buffer V2 The BSD (bit stream decoder) ring is used for accessing the BSD engine which decodes video bitstream for H.264 and VC1 on G45+. It is asynchronous with the render ring and has access to separate parts of the GPU from it, though the render cache is coherent between the two. Signed-off-by: Zou Nan hai <nanhai.zou@intel.com> Signed-off-by: Xiang Hai hao <haihao.xiang@intel.com> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-05-26 13:46:58 -07:00
Zou Nan hai	852835f343	drm/i915: convert some gem structures to per-ring V2 The active list and request list move into the ringbuffer structure, so each can track its active objects in the order they are in that ring. The flushing list does not, as it doesn't matter which ring caused data to end up in the render cache. Objects gain a pointer to the ring they are active on (if any). Signed-off-by: Zou Nan hai <nanhai.zou@intel.com> Signed-off-by: Xiang Hai hao <haihao.xiang@intel.com> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-05-26 13:42:11 -07:00
Zou Nan hai	8187a2b70e	drm/i915: introduce intel_ring_buffer structure (V2) Introduces a more complete intel_ring_buffer structure with callbacks for setup and management of a particular ringbuffer, and converts the render ring buffer consumers to use it. Signed-off-by: Zou Nan hai <nanhai.zou@intel.com> Signed-off-by: Xiang Hai hao <haihao.xiang@intel.com> [anholt: Fixed up whitespace fail and rebased against prep patches] Signed-off-by: Eric Anholt <eric@anholt.net>	2010-05-26 13:24:49 -07:00
Eric Anholt	d3301d86b4	drm/i915: Rename dev_priv->ring to dev_priv->render_ring. With the advent of the BSD ring, be clear about which ring this is. The docs are pretty consistent with calling this the Render engine at this point.	2010-05-26 12:36:00 -07:00
Eric Anholt	62fdfeaf8b	drm/i915: Move ringbuffer-related code to intel_ringbuffer.c. This is preparation for supporting multiple ringbuffers on Ironlake. The non-copy-and-paste changes are: - de-staticing functions - I915_GEM_GPU_DOMAINS moving to i915_drv.h to be used by both files. - i915_gem_add_request had only half its implementation copy-and-pasted out of the middle of it.	2010-05-26 12:36:00 -07:00
Daniel Vetter	007cc8ac4e	drm/i915: move fence lru to struct drm_i915_fence_reg This lru tracks fences, not objects, so move it to where it belongs. As a side effect, this nicely shrinks drm_i915_gem_object by two pointers. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-05-10 13:38:31 -07:00
Eric Anholt	34dc4d4423	Merge remote branch 'origin/master' into drm-intel-next Conflicts: drivers/gpu/drm/i915/i915_dma.c drivers/gpu/drm/i915/i915_drv.h drivers/gpu/drm/radeon/r300.c The BSD ringbuffer support that is landing in this branch significantly conflicts with the Ironlake PIPE_CONTROL fix on master, and requires it to be tested successfully anyway.	2010-05-10 13:36:52 -07:00
Chris Wilson	1637ef413b	drm/i915: Wait for the GPU whilst shrinking, if truly desperate. By idling the GPU and discarding everything we can when under extreme memory pressure, the number of OOM-killer events is dramatically reduced. For instance, this makes it possible to run firefox-planet-gnome.trace again on my swapless 512MiB i915. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-05-07 13:59:28 -07:00
Jesse Barnes	1918ad77f7	drm/i915: fix non-Ironlake 965 class crashes My PIPE_CONTROL fix (just sent via Eric's tree) was buggy; I was testing a whole set of patches together and missed a conversion to the new HAS_PIPE_CONTROL macro, which will cause breakage on non-Ironlake 965 class chips. Fortunately, the fix is trivial and has been tested. Be sure to use the HAS_PIPE_CONTROL macro in i915_get_gem_seqno, or we'll end up reading the wrong graphics memory, likely causing hangs, crashes, or worse. Reported-by: Zdenek Kabelac <zdenek.kabelac@gmail.com> Reported-by: Toralf Förster <toralf.foerster@gmx.de> Tested-by: Toralf Förster <toralf.foerster@gmx.de> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-04-23 10:39:20 -07:00
Jesse Barnes	e552eb7038	drm/i915: use PIPE_CONTROL instruction on Ironlake and Sandy Bridge Since 965, the hardware has supported the PIPE_CONTROL command, which provides fine grained GPU cache flushing control. On recent chipsets, this instruction is required for reliable interrupt and sequence number reporting in the driver. So add support for this instruction, including workarounds, on Ironlake and Sandy Bridge hardware. https://bugs.freedesktop.org/show_bug.cgi?id=27108 Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-04-22 14:48:55 -07:00
Daniel Vetter	a8089e849a	drm/i915: drop pointer to drm_gem_object Luckily the change is quite a little bit less invasive than I've feared. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2010-04-20 13:23:14 +10:00
Daniel Vetter	62b8b21515	drm/i915: don't use ->driver_private anymore Thanks to the to_intel_bo helper, this change is rather trivial. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2010-04-20 13:22:56 +10:00
Daniel Vetter	c397b9084c	drm/i915: embed the gem object into drm_i915_gem_object Just embed it and adjust the pointers, No other changes (that's for later patches). Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2010-04-20 13:22:45 +10:00
Daniel Vetter	ac52bc56de	drm/i915: introduce i915_gem_alloc_object Just preparation, no functional change. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2010-04-20 13:22:26 +10:00
Daniel Vetter	fd632aa34c	drm: free core gem object from driver callbacks When drivers embed the core gem object into their own structures, they'll have to do this. Temporarily this results in an ugly kfree(gem_obj); in every gem driver. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2010-04-20 13:19:33 +10:00
Daniel Vetter	c36a2a6de5	drm/i915: fix tiling limits for i915 class hw v2 Current code is definitely crap: Largest pitch allowed spills into the TILING_Y bit of the fence registers ... :( I've rewritten the limits check under the assumption that 3rd gen hw has a 3d pitch limit of 8kb (like 2nd gen). This is supported by an otherwise totally misleading XXX comment. This bug mostly resulted in tiling-corrupted pixmaps because the kernel allowed too wide buffers to be tiled. Bug brought to the light by the xf86-video-intel 2.11 release because that unconditionally enabled tiling for pixmaps, relying on the kernel to check things. Tiling for the framebuffer was not affected because the ddx does some additional checks there ensure the buffer is within hw-limits. v2: Instead of computing the value that would be written into the hw fence registers and then checking the limits simply check whether the stride is above the 8kb limit. To better document the hw, add some WARN_ONs in i915_write_fence_reg like I've done for the i830 case (using the right limits). Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=27449 Tested-by: Alexander Lam <lambchop468@gmail.com> Cc: stable@kernel.org Signed-off-by: Eric Anholt <eric@anholt.net>	2010-04-18 17:58:24 -07:00
Linus Torvalds	13bd8e4673	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel: drm/i915: Ignore LVDS EDID when it is unavailabe or invalid drm/i915: Add no_lvds entry for the Clientron U800 drm/i915: Rename many remaining uses of "output" to encoder or connector. drm/i915: Rename intel_output to intel_encoder. agp/intel: intel_845_driver is an agp driver! drm/i915: introduce to_intel_bo helper drm/i915: Disable FBC on 915GM and 945GM.	2010-04-17 14:28:50 -07:00
Tejun Heo	5a0e3ad6af	include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>	2010-03-30 22:02:32 +09:00
Daniel Vetter	23010e43b3	drm/i915: introduce to_intel_bo helper This is a purely cosmetic change to make changes in this area easier. And hey, it's not only clearer and typechecked, but actually shorter, too! [anholt: To clarify, this is a change to let us later make drm_i915_gem_object subclass drm_gem_object, instead of having drm_gem_object have a pointer to i915's private data] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Dave Airlie <airlied@gmail.com> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-03-25 11:06:17 -07:00
Chris Wilson	1f2b10131f	drm/i915: Avoid NULL deref in get_pages() unwind after error. Fixes: http://bugzilla.kernel.org/show_bug.cgi?id=15527 NULL pointer dereference in i915_gem_object_save_bit_17_swizzle BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<f82b5d2b>] i915_gem_object_save_bit_17_swizzle+0x5b/0xc0 [i915] Call Trace: [<f82aea55>] ? i915_gem_object_put_pages+0x125/0x150 [i915] [<f82aeb71>] ? i915_gem_object_get_pages+0xf1/0x110 [i915] [<f82b0de8>] ? i915_gem_object_bind_to_gtt+0xb8/0x2a0 [i915] [<c02db74d>] ? drm_mm_get_block_generic+0x4d/0x180 [<f82b11cd>] ? i915_gem_mmap_gtt_ioctl+0x16d/0x240 [i915] [<f82ae786>] ? i915_gem_madvise_ioctl+0x86/0x120 [i915] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reported-by: maciej.rutecki@gmail.com Cc: stable@kernel.org Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-03-17 13:17:24 -07:00
Eric Anholt	71cf39b117	drm/i915: Enable VS timer dispatch. This could resolve HW deadlocks where a unit downstream of the VS is waiting for more input, the VS has one vertex queued up but not dispatched because it hopes to get one more vertex for 2x4 dispatch, and software isn't handing more vertices down because it's waiting for rendering to complete. The B-Spec says you should always have this bit set. Signed-off-by: Eric Anholt <eric@anholt.net>	2010-03-17 12:59:32 -07:00
Owain G. Ainsworth	5d9391628e	drm/i915: remove an unnecessary wait_request() The continue just after this call will loop around and wait for the request just added just fine. This leads to slightly more compact code. Signed-Off-by: Owain G. Ainsworth <oga@openbsd.org> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-03-17 12:59:30 -07:00
Daniel Vetter	16edd55029	drm/i915: check for multiple write domains in pin_and_relocate The assumption that an object has only ever one write domain is deeply threaded into gem (it's even encoded in the singular of the variable name). Don't let userspace screw us over. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-02-26 13:23:22 -08:00
Daniel Vetter	922a2efc1b	drm/i915: clean-up i915_gem_flush_gpu_write_domain Now that we have an exact gpu write domain tracking, we don't need to move objects to the active list ourself. i915_add_request will take care of that under all circumstances. Idea stolen from a patch by Chris Wilson <chris@chris-wilson.co.uk>. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-02-26 13:23:22 -08:00
Daniel Vetter	4df2faf451	drm/i915: reuse i915_gpu_idle helper We have it, so use it. This required moving the function to avoid a forward declaration. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-02-26 13:23:21 -08:00
Daniel Vetter	6356039653	drm/i915: ensure lru ordering of fence_list The fence_list should be lru ordered for otherwise we might try to steal a fence reg from an active object even though there are fences from inactive objects available. lru ordering was obeyed for gpu access everywhere save when moving dirty objects from flushing_list to active_list. Fixing this causes the code to indent way too much, so I've extracted the flushing_list processing logic into its own function. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-02-26 13:23:21 -08:00
Daniel Vetter	ae3db24aab	drm/i915: extract fence stealing code The spaghetti logic in there tripped up my brain's code parser for a few secs. Prevent this from happening again by extracting the fence stealing code into a separate function. IMHO this slightly clears up the code flow. v2: Beautified according to ickle's comments. v3: ickle forgot to flush his comment queue ... Now there's also a we-are-paranoid BUG_ON in there. v4: I've forgotten to switch on my brain when doing v3. Now the BUG_ON actually checks something useful. v5: Clean up a stale comment as noted by Eric Anholt. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-02-26 13:23:21 -08:00
Daniel Vetter	4a87b8ca21	drm/i915: fixup active list locking in object_unbind All other accesses take this spinlock, so do this here, too. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-02-26 13:23:21 -08:00
Daniel Vetter	798750e30d	drm/i915: reuse i915_gem_object_put_fence_reg for fence stealing code This has a few functional changes against the old code: * a few more unnecessary loads and stores to the drm_i915_fence_reg objects. Also an unnecessary store to the hw fence register. * zaps any userspace mappings before doing other flushes. Only changes anything when userspace does racy stuff against itself. * also flush GTT domain. This is a noop, but still try to keep the bookkeeping correct. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-02-26 13:23:21 -08:00
Eric Anholt	f6e450a641	drm/i915: Fix sandybridge status page setup. The register's moved to the same location as the one for the BCS, it seems. Signed-off-by: Eric Anholt <eric@anholt.net>	2010-02-26 13:23:18 -08:00
Eric Anholt	4e901fdc26	drm/i915: Set up fence registers on sandybridge. Signed-off-by: Eric Anholt <eric@anholt.net>	2010-02-26 13:23:18 -08:00
Eric Anholt	bad720ff3e	drm/i915: Add initial bits for VGA modesetting bringup on Sandybridge. Signed-off-by: Eric Anholt <eric@anholt.net>	2010-02-26 13:23:17 -08:00
Dave Airlie	30d6c72c4a	Merge remote branch 'anholt/drm-intel-next' into drm-next-stage * anholt/drm-intel-next: drm/i915: Record batch buffer following GPU error drm/i915: give up on 8xx lid status drm/i915: reduce some of the duplication of tiling checking drm/i915: blow away userspace mappings before fence change drm/i915: move a gtt flush to the correct place agp/intel: official names for Pineview and Ironlake drm/i915: overlay: drop superflous gpu flushes drm/i915: overlay: nuke readback to flush wc caches drm/i915: provide self-refresh status in debugfs drm/i915: provide FBC status in debugfs drm/i915: fix drps disable so unload & re-load works drm/i915: Fix OGLC performance regression on 945 drm/i915: Deobfuscate the render p-state obfuscation drm/i915: add dynamic performance control support for Ironlake drm/i915: enable memory self refresh on 9xx drm/i915: Don't reserve compatibility fence regs in KMS mode. drm/i915: Keep MCHBAR always enabled drm/i915: Replace open-coded eviction in i915_gem_idle()	2010-02-25 13:39:36 +10:00
Dave Airlie	de19322d55	Merge remote branch 'korg/drm-core-next' into drm-next-stage * korg/drm-core-next: drm/ttm: handle OOM in ttm_tt_swapout drm/radeon/kms/atom: fix shr/shl ops drm/kms: fix spelling of "CLOCK" drm/kms: fix fb_changed = true else statement drivers/gpu/drm/drm_fb_helper.c: don't use private implementation of atoi() drm: switch all GEM/KMS ioctls to unlocked ioctl status. Use drm_gem_object_[handle_]unreference_unlocked where possible drm: introduce drm_gem_object_[handle_]unreference_unlocked	2010-02-25 13:39:29 +10:00
Owain Ainsworth	f590d279eb	drm/i915: reduce some of the duplication of tiling checking i915_gem_object_fenceable was mostly just a repeat of the i915_gem_object_fence_offset_ok, but also checking the size (which was checked when we allowed that BO to be tiled in the first place). So instead, export the latter function and use it in place. Signed-Off-By: Owain G. Ainsworth <oga@openbsd.org> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-02-22 11:54:42 -05:00
Daniel Vetter	10ae9bd25a	drm/i915: blow away userspace mappings before fence change This aligns it with the other user of i915_gem_clear_fence_reg, which blows away the mapping before changing the fence reg. Only affects userspace if it races against itself when changing tiling parameters, i.e. behaviour is undefined, anyway. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-02-22 11:54:42 -05:00
Daniel Vetter	4a7266123f	drm/i915: move a gtt flush to the correct place No functional change, because gtt flushing is a no-op. Still, try to keep the bookkeeping accurate. The if is still slightly wrong for with execbuf2 even i915-class hw doesn't always need a fence reg for gpu access. But that's for somewhen lateron. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-02-22 11:54:41 -05:00
Eric Anholt	b397c836ef	drm/i915: Don't reserve compatibility fence regs in KMS mode. The fence start is for compatibility with UMS X Servers before fence management. KMS X Servers only started doing tiling after fence management appeared. Signed-off-by: Eric Anholt <eric@anholt.net>	2010-02-16 11:48:44 -08:00
Chris Wilson	29105ccc43	drm/i915: Replace open-coded eviction in i915_gem_idle() With the introduction of the hang-check, we can safely expect that i915_wait_request() will always return even when the GPU hangs, and so do not need to open code the wait in order to manually check for the hang. Also we do not need to always evict all buffers, so only flush the GPU (and wait for it to idle) for KMS, but continue to evict for UMS. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-02-16 11:48:43 -08:00
Luca Barbieri	bc9025bdc4	Use drm_gem_object_[handle_]unreference_unlocked where possible Mostly obvious simplifications. The i915 pread/pwrite ioctls, intel_overlay_put_image and nouveau_gem_new were incorrectly using the locked versions without locking: this is also fixed in this patch. Signed-off-by: Luca Barbieri <luca@luca-barbieri.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2010-02-11 14:22:34 +10:00
Owain Ainsworth	a40e8d3139	drm/i915: Correctly return -ENOMEM on allocation failure in cmdbuf ioctls. Signed-off-by: Owain G. Ainsworth <oga@openbsd.org> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-02-10 15:08:00 -08:00
Daniel Vetter	99fcb766a3	drm/i915: Update write_domains on active list after flush. Before changing the status of a buffer with a pending write we will await upon a new flush for that buffer. So we can take advantage of any flushes posted whilst the buffer is active and pending processing by the GPU, by clearing its write_domain and updating its last_rendering_seqno -- thus saving a potential flush in deep queues and improving flushing behaviour upon eviction for both GTT space and fences. In order to reduce the time spent searching the active list for matching write_domains, we move those to a separate list whose elements are the buffers belonging to the active/flushing list with pending writes. Original patch by Chris Wilson <chris@chris-wilson.co.uk>, forward-ported by me. In addition to better performance, this also fixes a real bug. Before this change, i915_gem_evict_everything didn't work as advertised. When the gpu was actually busy and processing requests, the flush and subsequent wait would not move active and dirty buffers to the inactive list, but just to the flushing list. Which triggered the BUG_ON at the end of this function. With the tighter dirty buffer tracking, all currently busy and dirty buffers get moved to the inactive list by one i915_gem_flush operation. I've left the BUG_ON I've used to prove this in there. References: Bug 25911 - 2.10.0 causes kernel oops and system hangs http://bugs.freedesktop.org/show_bug.cgi?id=25911 Bug 26101 - [i915] xf86-video-intel 2.10.0 (and git) triggers kernel oops within seconds after login http://bugs.freedesktop.org/show_bug.cgi?id=26101 Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Adam Lantos <hege@playma.org> Cc: stable@kernel.org Signed-off-by: Eric Anholt <eric@anholt.net>	2010-02-10 13:31:45 -08:00
Linus Torvalds	f6510ec5a9	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel: drm/i915: Fix leak of relocs along do_execbuffer error path drm/i915: slow acpi_lid_open() causes flickering - V2 drm/i915: Disable SR when more than one pipe is enabled drm/i915: page flip support for Ironlake drm/i915: Fix the incorrect DMI string for Samsung SX20S laptop drm/i915: Add support for SDVO composite TV drm/i915: don't trigger ironlake vblank interrupt at irq install drm/i915: handle non-flip pending case when unpinning the scanout buffer drm/i915: Fix the device info of Pineview drm/i915: enable vblank interrupt on ironlake drm/i915: Prevent use of uninitialized pointers along error path. drm/i915: disable hotplug detect before Ironlake CRT detect	2010-02-06 13:01:39 -08:00
Chris Wilson	93533c291a	drm/i915: Fix leak of relocs along do_execbuffer error path Following a gpu hang, we would leak the relocation buffer. So simply rearrange the error path to always free the relocation buffer. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-02-04 09:04:01 -08:00
Chris Wilson	4bdadb9785	drm/i915: Selectively enable self-reclaim Having missed the ENOMEM return via i915_gem_fault(), there are probably other paths that I also missed. By not enabling NORETRY by default these paths can run the shrinker and take memory from the system (but not from our own inactive lists because our shrinker can not run whilst we hold the struct mutex) and this may allow the system to survive a little longer whilst our drivers consume all available memory. References: OOM killer unexpectedly called with kernel 2.6.32 http://bugzilla.kernel.org/show_bug.cgi?id=14933 v2: Pass gfp into page mapping. v3: Use new read_cache_page_gfp() instead of open-coding. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: Eric Anholt <eric@anholt.net> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-01-27 09:26:43 -08:00
Chris Wilson	0ce907f891	drm/i915: Prevent use of uninitialized pointers along error path. X.org hang with [drm:i915_gem_do_execbuffer] ERROR in dmesg http://bugzilla.kernel.org/show_bug.cgi?id=15114 Matej found he was hitting an error path within i915_gem_do_execbuffer() that led to the attempt to dereference an uninitialised pointer during cleanup. This path used to be safe as we used to calloc the object lists, but this was changed in `c8e0f93`. Daniel Vetter had also spotted this error and proposed a similar patch. [ 6379.732892] [drm:i915_gem_do_execbuffer] ERROR Object ffff880098cd6540 appears more than once in object list [ 6379.740976] [drm:i915_gem_do_execbuffer] ERROR Object ffff880098cd6540 appears more than once in object list [ 6379.740995] BUG: unable to handle kernel NULL pointer dereference at 00000000000000a0 [ 6379.740998] IP: [<ffffffff8122ddb5>] i915_gem_do_execbuffer+0xba5/0x1260 [ 6379.741006] PGD babab067 PUD bb435067 PMD 0 [ 6379.741010] Oops: 0002 [#1] PREEMPT SMP [ 6379.741014] last sysfs file: /sys/devices/pci0000:00/0000:00:1c.2/0000:06:00.0/ieee80211/phy0/rfkill0/state [ 6379.741017] CPU 1 [ 6379.741021] Pid: 2186, comm: X Not tainted 2.6.33-rc4-00399-g24bc734 #142 M11D/ESPRIMO Mobile M9400 [ 6379.741023] RIP: 0010:[<ffffffff8122ddb5>] [<ffffffff8122ddb5>] i915_gem_do_execbuffer+0xba5/0x1260 [ 6379.741027] RSP: 0018:ffff8800b9047b78 EFLAGS: 00213206 [ 6379.741029] RAX: 0000000000000000 RBX: 000000000000004f RCX: ffff880098cac800 [ 6379.741032] RDX: ffff880098caca78 RSI: ffff8800b9047c98 RDI: ffff880098cd6540 [ 6379.741034] RBP: ffff8800b9047c78 R08: ffffffff814b96b5 R09: 0000000000000006 [ 6379.741036] R10: 0000000000000000 R11: 0000000000000003 R12: 000000000000004e [ 6379.741038] R13: 00000000fffffff7 R14: 0000000000000000 R15: 0000000000000001 [ 6379.741041] FS: 
0000000000000000(0000) GS:ffff880001900000(0063) knlGS:00000000f72636c0 [ 6379.741043] CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033 [ 6379.741045] CR2: 00000000000000a0 CR3: 00000000b9000000 CR4: 00000000000006e0 [ 6379.741048] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 6379.741050] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 6379.741052] Process X (pid: 2186, threadinfo ffff8800b9046000, task ffff8800bb5d8000) [ 6379.741054] Stack: [ 6379.741055] ffffc90023f57000 ffffc90023f56fff ffffc90023f56fff ffffc90023f55000 [ 6379.741059] <0> ffff8800b9047c98 ffff8800bb43c840 ffff8800bf1de800 ffff8800bf1de820 [ 6379.741063] <0> ffff8800b9047bd8 ffff880098cac800 0000000000000000 0000000000000002 [ 6379.741068] Call Trace: [ 6379.741072] [<ffffffff8122e6cb>] ? i915_gem_execbuffer+0x6b/0x370 [ 6379.741077] [<ffffffff810a5f52>] ? __vmalloc_node+0xa2/0xb0 [ 6379.741080] [<ffffffff8122e6cb>] ? i915_gem_execbuffer+0x6b/0x370 [ 6379.741083] [<ffffffff8122e816>] i915_gem_execbuffer+0x1b6/0x370 [ 6379.741086] [<ffffffff8120cd55>] drm_ioctl+0x1d5/0x460 [ 6379.741089] [<ffffffff8122e660>] ? i915_gem_execbuffer+0x0/0x370 [ 6379.741093] [<ffffffff81248c35>] i915_compat_ioctl+0x45/0x50 [ 6379.741097] [<ffffffff810f1659>] compat_sys_ioctl+0xa9/0x1570 [ 6379.741102] [<ffffffff810b1d5c>] ? 
vfs_read+0x13c/0x1a0 [ 6379.741106] [<ffffffff81028424>] sysenter_dispatch+0x7/0x2b [ 6379.741108] Code: 08 85 c0 74 52 31 db 0f 1f 80 00 00 00 00 48 63 c3 48 8b 8d 68 ff ff ff 48 8d 14 c1 48 8b 02 48 85 c0 74 25 48 8b 80 80 00 00 00 <c7> 80 a0 00 00 00 00 00 00 00 48 8b 3a 48 85 ff 74 0c 48 c7 c6 [ 6379.741142] RIP [<ffffffff8122ddb5>] i915_gem_do_execbuffer+0xba5/0x1260 [ 6379.741145] RSP <ffff8800b9047b78> [ 6379.741147] CR2: 00000000000000a0 [ 6379.741159] ---[ end trace 0598809afa4c31db ]--- Reported-by: Matej Laitl <strohel@gmail.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-01-25 09:00:38 -08:00
Eric Anholt	6036ae7e94	drm/i915: Remove chatty execbuf failure message. Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> (in principle) Signed-off-by: Eric Anholt <eric@anholt.net>	2010-01-15 13:05:36 -08:00
Zhenyu Wang	b9241ea31f	drm/i915: Don't wait interruptible for possible plane buffer flush When we set up a buffer for the display plane, we check for any pending required GPU flush and possibly make an interruptible wait for the flush to complete. But that wait is most likely to fail if signals are received for the X process, which then fails the modeset process and puts the display engine in an inconsistent state. The result could be a blank screen or CPU hang, and the DDX driver would always turn on output DPMS whether the modeset fails or not. So this one creates a new helper to set up the display plane buffer, and when a flush is needed it uses an uninterruptible wait for it. This one should fix bugs like https://bugs.freedesktop.org/show_bug.cgi?id=24009. It also fixes the mode switch stress test on Ironlake. Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Eric Anholt <eric@anholt.net>	2010-01-12 15:07:34 -08:00
Linus Torvalds	2c1f1895ef	Merge branch 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: drm/radeon/kms: rs600: use correct mask for SW interrupt gpu/drm/radeon/radeon_irq.c: move a dereference below a NULL test drm/radeon/radeon_device.c: move a dereference below a NULL test drm/radeon/radeon_fence.c: move a dereference below the NULL test drm/radeon/radeon_connectors.c: add a NULL test before dereference drm/radeon/kms: fix memory leak drm/kms: Fix &&/\|\| confusion in drm_fb_helper_connector_parse_command_line() drm/edid: Fix CVT width/height decode drm/edid: Skip empty CVT codepoints drm: remove address mask param for drm_pci_alloc() drm/radeon/kms: add missing breaks in i2c and ss lookups drm/radeon/kms: add primary dac adj values table drm/radeon/kms: fallback to default connector table	2010-01-06 20:26:42 -08:00
Zhenyu Wang	e6be8d9d17	drm: remove address mask param for drm_pci_alloc() drm_pci_alloc() takes an address mask as input for setting the pci dma mask on the device, which should be set up properly by the drm driver. Leaving it as a param for drm_pci_alloc() would cause confusion, or a mistake could corrupt the correct dma mask setting, as seen on intel hw which set the wrong dma mask for the hw status page. So remove it from the drm_pci_alloc() function. Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2010-01-07 13:15:50 +10:00
Chris Wilson	e3d8affb0d	drm/i915: Permit pinning whilst the device is 'suspended' As pinning (allocating and binding GTT memory) does not actually invoke GPU commands, it is safe, and indeed is attempted, during resumption from suspension: [drm:intel_init_clock_gating] ERROR failed to pin power context: -16 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reported-by: Hugh Dickins <hugh.dickins@tiscali.co.uk> Cc: stable@kernel.org Signed-off-by: Eric Anholt <eric@anholt.net>	2010-01-06 09:40:11 -08:00
Jesse Barnes	76446cac68	drm/i915: execbuf2 support This patch adds a new execbuf ioctl, execbuf2, for use by clients that want to control fence register allocation more finely. The buffer passed in to the new ioctl includes a new relocation type to indicate whether a given object needs a fence register assigned for the command buffer in question. Compatibility with the existing execbuf ioctl is implemented in terms of the new code, preserving the assumption that fence registers are required for pre-965 rendering commands. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> [ickle: Remove pre-emptive clear_fence_reg()] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> [anholt: Removed dmesg spam] Signed-off-by: Eric Anholt <eric@anholt.net>	2010-01-06 09:39:39 -08:00
Daniel Vetter	96b47b6559	drm/i915: fix order of fence release wrt flushing i915_gem_object_unbind had the ordering wrong. The other user, i915_gem_object_put_fence_reg already has the correct ordering. The result was usually corrupted pixmaps, especially garbled font glyphs after a suspend/resume (because this evicts everything). I'm still waiting for the feedback from the bug-reporters, but because this obviously fixes a bug (at least for me) I'm already submitting it. Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=25406 Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Eric Anholt <eric@anholt.net> CC: stable@kernel.org	2009-12-16 09:18:37 -08:00
Linus Torvalds	3ef884b4c0	Merge branch 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (189 commits) drm/radeon/kms: fix warning about cur_placement being uninitialised. drm/ttm: Print debug information on memory manager when eviction fails drm: Add memory manager debug function drm/radeon/kms: restore surface registers on resume. drm/radeon/kms/r600/r700: fallback gracefully on ucode failure drm/ttm: Initialize eviction placement in case the driver callback doesn't drm/radeon/kms: cleanup structure and module if initialization fails drm/radeon/kms: actualy set the eviction placements we choose drm/radeon/kms: Fix NULL ptr dereference drm/radeon/kms/avivo: add support for new pll selection algo drm/radeon/kms/avivo: fix some bugs in the display bandwidth setup drm/radeon/kms: fix return value from fence function. drm/radeon: Remove tests for -ERESTART from the TTM code. drm/ttm: Have the TTM code return -ERESTARTSYS instead of -ERESTART. drm/radeon/kms: Convert radeon to new TTM validation API (V2) drm/ttm: Rework validation & memory space allocation (V3) drm: Add search/get functions to get a block in a specific range drm/radeon/kms: fix avivo tiling regression since radeon object rework drm/i915: Remove a debugging printk from hangcheck drm/radeon/kms: make sure i2c id matches ...	2009-12-10 21:56:47 -08:00
Chris Wilson	5618ca6abc	drm/i915: Set the error code after failing to insert new offset into mm ht. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org Signed-off-by: Eric Anholt <eric@anholt.net>	2009-12-07 15:44:30 -08:00
Adam Jackson	f2b115e69d	drm/i915: Fix product names and #defines IGD* isn't a useful name. Replace with the codenames, as sourced from pci.ids. Signed-off-by: Adam Jackson <ajax@redhat.com> [anholt: Fixed up for merge with pineview/ironlake changes] Signed-off-by: Eric Anholt <eric@anholt.net>	2009-12-07 14:55:56 -08:00
Chris Wilson	ffb4728095	drm/i915: Drop a some common DRM_ERROR() These are handled by the error return being propagated to user-space and do not add any information to the original error, so are useless. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-12-07 12:18:28 -08:00
André Goddard Rosa	af901ca181	tree-wide: fix assorted typos all over the place That is "success", "unknown", "through", "performance", "[re\|un]mapping" , "access", "default", "reasonable", "[con]currently", "temperature" , "channel", "[un]used", "application", "example","hierarchy", "therefore" , "[over\|under]flow", "contiguous", "threshold", "enough" and others. Signed-off-by: André Goddard Rosa <andre.goddard@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2009-12-04 15:39:55 +01:00
Kristian Høgsberg	6b95a207c1	drm/i915: Add intel implementation of the pageflip ioctl Acked-by: Jakob Bornecrantz <jakob@vmware.com> Acked-by: Thomas Hellström <thomas@shipmail.org> Review-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Jesse "Orange Smoothie" Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Kristian Høgsberg <krh@bitplanet.net> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-12-01 09:10:35 -08:00
Eric Anholt	c8e0f93a38	drm/i915: Replace a calloc followed by copying data over it with malloc. Execbufs involve quite a bit of payload, to the extent that cache misses show up in the profiles here, and a suspicion that some of those cachelines may get evicted and then reloaded in the subsequent copy. This is still abstracted like drm_calloc_large since we want to check for size overflow, and because we want to choose between kmalloc and vmalloc on the fly. cairo's interface for malloc-with-calloc's-args was used as the model. Signed-off-by: Eric Anholt <eric@anholt.net>	2009-11-25 06:36:21 -08:00
Zhao Yakui	44d98a6142	drm/i915: Replace DRM_DEBUG with DRM_DEBUG_DRIVER Replace the DRM_DEBUG with DRM_DEBUG_DRIVER in generic i915 driver. Then the debug info can be obtained by adding the boot option of "drm.debug=0x02". At the same time the debug info in increase/decrease clock is also printed by using DRM_DEBUG_DRIVER instead of DRM_DEBUG_KMS. Signed-off-by: Zhao Yakui <yakui.zhao@intel.com> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-11-05 14:47:10 -08:00
Daniel Vetter	1df4b35b61	drm/i915: kill i915_lp_ring_sync It's not needed anymore. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-11-05 14:47:09 -08:00
Daniel Vetter	5a5a0c64a9	drm/i915: implement fastpath for overlay flip waiting As long as the gpu can keep up, neither the cpu (waiting for gpu) nor the gpu (waiting for vblank to do an overlay flip) stalls. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-11-05 14:47:09 -08:00
Daniel Vetter	48764bf43f	drm/i915: add i915_lp_ring_sync helper This just waits until the hw passed the current ring position with cmd execution. This slightly changes the existing i915_wait_request function to make uninterruptible waiting possible - no point in returning to userspace while mucking around with the overlay, that piece of hw is just too fragile. Also replace a magic 0 with the symbolic constant (and kill the then superfluous comment) while I was looking at the code. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-11-05 14:47:07 -08:00
Chris Wilson	9d34e5db07	drm/i915: Enable irq to trace batch buffer completion. If we trigger a tracepoint for batch buffer submission, it is a reasonable assumption that we wish to also trace the batch buffer completion. So in order to capture the completion events, we need to enable irqs... However, we cannot rely on the completion event to disable the irq later, so we defer the irq disable to the retire request. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2009-09-29 03:15:25 +01:00
Chris Wilson	8f0dc5bf17	drm/i915: batch submit seqno off-by-one. We increment the seqno number between submitting the batch buffer and the flush/interrupt that demarcates its end, so the tracepoint needs to reference the incremented value to match the completion event. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2009-09-29 03:15:24 +01:00
Linus Torvalds	94e0fb086f	Merge branch 'drm-intel-next' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel * 'drm-intel-next' of git://git.kernel.org/pub/scm/linux/kernel/git/anholt/drm-intel: (57 commits) drm/i915: Handle ERESTARTSYS during page fault drm/i915: Warn before mmaping a purgeable buffer. drm/i915: Track purged state. drm/i915: Remove eviction debug spam drm/i915: Immediately discard any backing storage for uneeded objects drm/i915: Do not mis-classify clean objects as purgeable drm/i915: Whitespace correction for madv drm/i915: BUG_ON page refleak during unbind drm/i915: Search harder for a reusable object drm/i915: Clean up evict from list. drm/i915: Add tracepoints drm/i915: framebuffer compression for GM45+ drm/i915: split display functions by chip type drm/i915: Skip the sanity checks if the current relocation is valid drm/i915: Check that the relocation points to within the target drm/i915: correct FBC update when pipe base update occurs drm/i915: blacklist Acer AspireOne lid status ACPI: make ACPI button funcs no-ops if not built in drm/i915: prevent FIFO calculation overflows on 32 bits with high dotclocks drm/i915: intel_display.c handle latency variable efficiently ... Fix up trivial conflicts in drivers/gpu/drm/i915/{i915_dma.c\|i915_drv.h}	2009-09-24 10:30:41 -07:00
Chris Wilson	c715089f49	drm/i915: Handle ERESTARTSYS during page fault During a page fault and rebinding the buffer there exists a window for a signal to arrive during the i915_wait_request() and trigger a ERESTARTSYS. This used to be handled by returning SIGBUS and thereby killing the application. Try 'cairo-perf-trace & cairo-test-suite' and watch X go boom! The solution as suggested by H. Peter Anvin is to simply return NOPAGE and leave the higher layers to spot we did not fill the page and resubmit the page fault. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org [anholt: Mostly squash it with another commit]	2009-09-22 18:25:32 -07:00
Chris Wilson	ab18282d58	drm/i915: Warn before mmaping a purgeable buffer. Only allow the user to mmap buffers that have not been marked as purgeable. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2009-09-23 01:10:39 +01:00
Chris Wilson	bb6baf76f4	drm/i915: Track purged state. In order to correctly prevent the invalid reuse of a purged buffer, we need to track such events and warn the user before something bad happens. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2009-09-23 01:10:38 +01:00
Chris Wilson	9731129c5e	drm/i915: Remove eviction debug spam Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2009-09-23 01:10:38 +01:00
Chris Wilson	2d7ef395b3	drm/i915: Immediately discard any backing storage for uneeded objects Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2009-09-23 01:10:37 +01:00
Chris Wilson	963b483691	drm/i915: Do not mis-classify clean objects as purgeable Whilst cleaning up the patches for submission, I mis-classified non-dirty objects as purgeable. This was causing the backing pages for those objects to be evicted under memory-pressure, discarding valid and unreplaceable texture data. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2009-09-23 01:10:36 +01:00
Chris Wilson	13a05fd978	drm/i915: Whitespace correction for madv Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2009-09-23 01:10:35 +01:00
Chris Wilson	a32808c0a1	drm/i915: BUG_ON page refleak during unbind Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2009-09-23 01:10:21 +01:00
Chris Wilson	9a1e2582d8	drm/i915: Search harder for a reusable object As evict_something() is called by routines that do not repeatedly search again, try harder in the initial search to find an object that matches the request. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2009-09-23 01:05:28 +01:00
Chris Wilson	ab5ee57650	drm/i915: Clean up evict from list. First the routine attempted to unlock a mutex it did not own along the error path. Secondly the routine should never be called on any list but the inactive one, since we attempt to unbind those objects, so fix the calling semantics. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2009-09-23 01:05:26 +01:00
Chris Wilson	1c5d22f76d	drm/i915: Add tracepoints By adding tracepoint equivalents for WATCH_BUF/EXEC we are able to monitor the lifetimes of objects, requests and significant events. These events can then be probed using the tracing frameworks, such as systemtap and, in particular, perf. For example to record the stack trace for every GPU stall during a run, use $ perf record -e i915:i915_gem_request_wait_begin -c 1 -g And $ perf report to view the results. [Updated to fix compilation issues caused.] Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Ben Gamari <bgamari@gmail.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2009-09-23 01:05:21 +01:00
Linus Torvalds	44040f107e	Merge branch 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (133 commits) drm/vgaarb: add VGA arbitration support to the drm and kms. drm/radeon: some r420s have a CP race with the DMA engine. drm/radeon/r600/kms: rv670 is not DCE3 drm/radeon/kms: r420 idle after programming GA_ENHANCE drm/radeon/kms: more fixes to rv770 suspend/resume path. drm/radeon/kms: more alignment for rv770.c with r600.c drm/radeon/kms: rv770 blit init called too late. drm/radeon/kms: move around new init path code to avoid posting at init drm/radeon/r600: fix some issues with suspend/resume. drm/radeon/kms: disable VGA rendering engine before taking over VRAM drm/radeon/kms: Move radeon_get_clock_info() call out of radeon_clocks_init(). drm/radeon/kms: add initial connector properties drm/radeon/kms: Use surfaces for scanout / cursor byte swapping on big endian. drm/radeon/kms: don't fail if we fail to init GPU acceleration drm/r600/kms: fixup number of loops per blit calculation. drm/radeon/kms: reprogram format in set base. drm/radeon: avivo chips have no separate int bit for display drm/radeon/r600: don't do interrupts drm: fix _DRM_GEM addmap error message drm: update crtc x/y when only fb changes ... Fixed up trivial conflicts in firmware/Makefile due to network driver (cxgb3) and drm (mga/r128/radeon) firmware being listed next to each other.	2009-09-21 08:10:09 -07:00
Chris Wilson	8542a0bbbb	drm/i915: Skip the sanity checks if the current relocation is valid If the presumed_offset as fed to userspace and returned to the kernel from a previous execbuffer is still valid, then we do not need to rewrite the relocation entry and may skip the offset sanity checks. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-09-18 15:31:39 -07:00
Chris Wilson	cd0b9fb400	drm/i915: Check that the relocation points to within the target Eric noted a potential concern with the low bits not being strictly used as part of the absolute offset (instead part of the command stream to the GPU), but in practice that should not be an issue. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Andy Whitcroft <apw@canonical.com> Cc: Eric Anholt <eric@anholt.net> CC: stable@kernel.org Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-09-18 15:31:21 -07:00
Chris Wilson	07f73f6912	drm/i915: Improve behaviour under memory pressure Due to the necessity of having to take the struct_mutex, the i915 shrinker can not free the inactive lists if we fail to allocate memory whilst processing a batch buffer, triggering an OOM and an ENOMEM that is reported back to userspace. In order to fare better under such circumstances we need to manually retry a failed allocation after evicting inactive buffers. To do so involves 3 steps: 1. Marking the backing shm pages as NORETRY. 2. Updating the get_pages() callers to evict something on failure and then retry. 3. Revamping the evict something logic to be smarter about the required buffer size and prefer to use volatile or clean inactive pages. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-09-17 14:43:32 -07:00
Chris Wilson	3ef94daae7	drm/i915: Add ioctl to set 'purgeability' of objects Similar to the madvise() concept, the application may wish to mark some data as volatile. That is in the event of memory pressure the kernel is free to discard such buffers safe in the knowledge that the application can recreate them on demand, and is simply using these as a cache. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-09-17 14:43:31 -07:00
Chris Wilson	31169714fc	drm/i915: Register a shrinker to free inactive lists under memory pressure This should help GEM handle memory pressure situations more gracefully. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-09-17 14:43:31 -07:00
Chris Wilson	e67b8ce1b5	drm/i915: Remove stored gtt_alignment There is no need to store the gtt_alignment as it is either explicitly set according to the hardware requirements (e.g. scanout) or the minimum alignment is computed on demand. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-09-17 14:43:29 -07:00
Chris Wilson	4960aaca14	drm/i915: Add buffer to inactive list immediately during fault If we failed to set the domain, the buffer was no longer being tracked on any list. Cc: stable@kernel.org Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-09-17 14:43:28 -07:00
Ben Gamari	ba1234d17b	drm/i915: Make dev_priv->mm.wedged an atomic_t There is a very real possibility that multiple CPUs will notice that the GPU is wedged. This introduces all sorts of potential race conditions. Make the wedged flag atomic to mitigate this risk. Signed-off-by: Ben Gamari <bgamari.foss@gmail.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-09-17 14:36:46 -07:00
Ben Gamari	f65d94211e	drm/i915: Add hangcheck timer We set a periodic timer to check on the GPU, resetting it every time a batch is completed. If the timer elapses, we check acthd. If acthd hasn't changed in two timer periods, we assume the chip is wedged. This is implemented in such a way that it leaves the option open to employ adaptive timer intervals in the future. One could wait until several timer periods have elapsed before declaring the chip dead. If the chip comes back after several periods but before the "dead" threshold, the timer interval or dead threshold could be raised. It is important to note that while checking for active requests, we need to account for the fact that requests are removed from the list (i.e. retired) in a deferred work queue handler. This means that merely checking for an empty request_list is insufficient; the list could be non-empty yet the GPU still idle, causing the hangcheck timer to incorrectly mark the GPU as wedged (it took me a while to figure that out---sigh...) Signed-off-by: Ben Gamari <bgamari.foss@gmail.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-09-17 14:36:01 -07:00
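The hangcheck rule described in f65d94211e can be sketched in plain C as below. This is only an illustration of the "ACTHD unchanged for two timer periods" idea, not the driver's code: `struct hangcheck`, `hangcheck_tick()` and the `busy` flag are hypothetical names, and timer setup and the deferred retire handling mentioned in the commit are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state kept between timer ticks (names are illustrative). */
struct hangcheck {
	uint32_t last_acthd; /* ACTHD value sampled on the previous tick */
	int stalled_ticks;   /* consecutive ticks without progress */
};

/*
 * Called once per timer period. `busy` says whether requests are still
 * outstanding. Returns true when the chip should be treated as wedged.
 */
static bool hangcheck_tick(struct hangcheck *hc, uint32_t acthd, bool busy)
{
	assert(hc);

	/* An idle GPU or a moving ACTHD both count as progress. */
	if (!busy || acthd != hc->last_acthd) {
		hc->last_acthd = acthd;
		hc->stalled_ticks = 0;
		return false;
	}

	/* ACTHD unchanged while busy: wedged after two such periods. */
	return ++hc->stalled_ticks >= 2;
}
```

Keeping a counter rather than a single flag leaves room for the adaptive thresholds the commit message mentions: the constant 2 could become a tunable.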
Ben Gamari	22be172423	drm/i915: make i915_seqno_passed non-static We'll need it in i915_irq.c for checking whether there are outstanding requests. Also, the function really ought to return a bool, not an int. Signed-off-by: Ben Gamari <bgamari.foss@gmail.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-09-17 14:35:11 -07:00
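For readers unfamiliar with sequence-number checks, a helper in the spirit of i915_seqno_passed() can be sketched as a wraparound-safe comparison returning a bool. The sketch assumes 32-bit sequence numbers; `seqno_passed()` and `seqno_selfcheck()` are illustrative names, not code from the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True when seq1 is at or after seq2, treating the 32-bit counter as
 * circular. The unsigned subtraction wraps modulo 2^32; reading the
 * difference as signed gives the right answer while the two values are
 * less than 2^31 apart.
 */
static bool seqno_passed(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) >= 0;
}

/* Small usage example, including a comparison across the wrap point. */
static void seqno_selfcheck(void)
{
	assert(seqno_passed(5, 3));
	assert(!seqno_passed(3, 5));
	assert(seqno_passed(2, 0xfffffff0u)); /* 2 comes after the wrap */
}
```

A plain `seq1 >= seq2` would misorder requests once the counter wraps, which is why the subtraction form is the usual choice for this kind of check.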
Ben Gamari	ffed1d0920	drm/i915: Check whether chip is wedged in i915_wait_request() i915_wait_request() only checks mm.wedged after it interacts with the hardware, generally causing the driver to lock up waiting for a wedged chip. Make sure we check mm.wedged as the first thing we do. Reported-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Gamari <bgamari.foss@gmail.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-09-17 14:33:52 -07:00
Chris Wilson	7e61615857	drm/i915: Only destroy a constructed mmap offset drm_ht_remove_item() does not handle removing an absent item and the hlist in particular is incorrectly initialised. The easy remedy is simply skip calling i915_gem_free_mmap_offset() unless we have actually created the offset and associated ht entry. This also fixes the mishandling of a partially constructed offset which leaves pointers initialized after freeing them along the i915_gem_create_mmap_offset() error paths. In particular this should fix the oops found here: https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/415357/comments/8 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net> Cc: stable@kernel.org	2009-09-11 11:40:39 -07:00
Eric Anholt	e517a5e970	agp/intel: Fix the pre-9xx chipset flush. Ever since we enabled GEM, the pre-9xx chipsets (particularly 865) have had serious stability issues. Back in May a wbinvd was added to the DRM to work around much of the problem. Some failure remained -- easily visible by dragging a window around on an X -retro desktop, or by looking at bugzilla. The chipset flush was on the right track -- hitting the right amount of memory, and it appears to be the only way to flush on these chipsets, but the flush page was mapped uncached. As a result, the writes trying to clear the writeback cache ended up bypassing the cache, and not flushing anything! The wbinvd would flush out other writeback data and often cause the data we wanted to get flushed, but not always. By removing the setting of the page to UC and instead just clflushing the data we write to try to flush it, we get the desired behavior with no wbinvd. This exports clflush_cache_range(), which was laying around and happened to basically match the code I was otherwise going to copy from the DRM. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Brice Goglin <Brice.Goglin@ens-lyon.org> Cc: stable@kernel.org	2009-09-11 11:39:23 -07:00
Eric Anholt	5323fd042f	drm/i915: Zap mmaps of objects before unbinding them from the GTT. Otherwise, some other userland writing into its buffer may race to land writes either after the CPU thinks it's got a coherent view, or after its GTT entries have been redirected to point at the scratch page. Either result is unpleasant. Signed-off-by: Eric Anholt <eric@anholt.net>	2009-09-09 12:52:05 -07:00
Linus Torvalds	e6890f6f3d	i915: disable interrupts before tearing down GEM state Reinette Chatre reports a frozen system (with blinking keyboard LEDs) when switching from graphics mode to the text console, or when suspending (which does the same thing). With netconsole, the oops turned out to be BUG: unable to handle kernel NULL pointer dereference at 0000000000000084 IP: [<ffffffffa03ecaab>] i915_driver_irq_handler+0x26b/0xd20 [i915] and it's due to the i915_gem.c code doing drm_irq_uninstall() after having done i915_gem_idle(). And the i915_gem_idle() path will do i915_gem_idle() -> i915_gem_cleanup_ringbuffer() -> i915_gem_cleanup_hws() -> dev_priv->hw_status_page = NULL; but if an i915 interrupt comes in after this stage, it may want to access that hw_status_page, and gets the above NULL pointer dereference. And since the NULL pointer dereference happens from within an interrupt, and with the screen still in graphics mode, the common end result is simply a silently hung machine. Fix it by simply uninstalling the irq handler before idling rather than after. Fixes http://bugzilla.kernel.org/show_bug.cgi?id=13819 Reported-and-tested-by: Reinette Chatre <reinette.chatre@intel.com> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-09-08 17:09:24 -07:00
Dave Airlie	11670d3c93	Merge intel drm-intel-next branch Merge remote branch 'anholt/drm-intel-next' of ../anholt-2.6 into drm-next Conflicts: drivers/gpu/drm/i915/intel_display.c drivers/gpu/drm/i915/intel_drv.h drivers/gpu/drm/i915/intel_sdvo.c	2009-09-07 20:27:20 +10:00
Chris Wilson	0ef82af725	drm/i915: Pad ringbuffer with NOOPs before wrapping According to the docs, the ringbuffer is not allowed to wrap in the middle of an instruction. G45 PRM, Vol 1b, p101: While the “free space” wrap may allow commands to be wrapped around the end of the Ring Buffer, the wrap should only occur between commands. Padding (with NOP) may be required to follow this restriction. Do as commanded. [Having seen bug reports where there is evidence of split commands, but apparently the GPU has continued on merrily before a bizarre and untimely death, this may or may not fix a few random hangs.] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> CC: Eric Anholt <eric@anholt.net> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-09-06 11:29:06 -07:00
Jesse Barnes	652c393a33	drm/i915: add dynamic clock frequency control There are several sources of unnecessary power consumption on Intel graphics systems. The first is the LVDS clock. TFTs don't suffer from persistence issues like CRTs, and so we can reduce the LVDS refresh rate when the screen is idle. It will be automatically upclocked when userspace triggers graphical activity. Beyond that, we can enable memory self refresh. This allows the memory to go into a lower power state when the graphics are idle. Finally, we can drop some clocks on the gpu itself. All of these things can be reenabled between frames when GPU activity is triggered, and so there should be no user visible graphical changes. Signed-off-by: Jesse Barnes <jesse.barnes@intel.com> Signed-off-by: Matthew Garrett <mjg@redhat.com> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-09-04 13:05:38 -07:00
Chris Wilson	58c2fb647a	drm/i915: Unref old_obj on get_fence_reg() error path Remember to release the local reference if we fail to wait on the rendering. (Also whilst in the vicinity add some whitespace so that the phasing of the operations is clearer.) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-09-02 10:49:02 -07:00
Eric Anholt	a09ba7faf7	drm/i915: Fix CPU-spinning hangs related to fence usage by using an LRU. The lack of a proper LRU was partially worked around by taking the fence from the object containing the oldest seqno. But if there are multiple objects inactive, then they don't have seqnos and the first fence reg among them would be chosen. If you were trying to copy data between two mappings, this could result in each page fault stealing the fence from the other argument, and your application hanging. https://bugs.freedesktop.org/show_bug.cgi?id=23566 https://bugs.freedesktop.org/show_bug.cgi?id=23220 https://bugs.freedesktop.org/show_bug.cgi?id=23253 https://bugs.freedesktop.org/show_bug.cgi?id=23366 Cc: Stable Team <stable@kernel.org> Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2009-08-29 17:37:21 -07:00
Pekka Paalanen	a1a2d1d322	drm: GEM handles are u32, not int Several functions in the GEM kernel API used int as handle type, but user API has it __u32 which is also the intended type. Replace int with u32. Signed-off-by: Pekka Paalanen <pq@iki.fi> Signed-off-by: Dave Airlie <airlied@redhat.com>	2009-08-27 11:21:08 +10:00
Eric Anholt	9c9fe1f841	drm/i915: Use our own workqueue to avoid wedging the system along with the GPU. Signed-off-by: Eric Anholt <eric@anholt.net>	2009-08-05 11:20:53 -07:00
Eric Anholt	d05ca30199	drm/i915: Zap the GTT mapping when transitioning from untiled to tiled. As of `52dc7d32b8`, we could leave an old linear GTT mapping in place, so that apps trying to GTT-mapped write in tiled data wouldn't get the fence added, and garbage would get displayed. Signed-off-by: Eric Anholt <eric@anholt.net>	2009-07-10 14:10:58 -07:00
Chris Wilson	901782b21e	drm/i915: Refactor calls to unmap_mapping_range As we call unmap_mapping_range() twice in identical fashion, refactor and attempt to explain why we need to call unmap_mapping_range(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-07-10 14:10:53 -07:00
Grégoire Henry	b5aa8a0fc1	drm/i915: initialize fence registers to zero when loading GEM An uninitialized fence register could lead to a corrupted display. Problem encountered on MacBooks (revision 1 and 2), directly booting from EFI or through BIOS emulation. (bug #21710 at freedesktop.org) Signed-off-by: Grégoire Henry <henry@pps.jussieu.fr> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-06-23 09:22:11 -07:00
Krzysztof Halasa	cfd43c025d	drm/i915: Fix size_t handling in off-by-default debug printfs Signed-off-by: Krzysztof Halasa <khc@pm.waw.pl> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-06-22 20:19:19 -07:00
Eric Anholt	9a298b2acd	drm: Remove memory debugging infrastructure. It hasn't been used in ages, and having the user tell your how much memory is being freed at free time is a recipe for disaster even if it was ever used. Signed-off-by: Eric Anholt <eric@anholt.net>	2009-06-18 13:00:33 -07:00
Chris Wilson	52dc7d32b8	drm/i915: Clear fence register on tiling stride change. The fence register value also depends upon the stride of the object, so we need to clear the fence if that is changed as well. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [anholt: Added 8xx and 965 paths, and renamed the confusing i915_gem_object_tiling_ok function to i915_gem_object_fence_offset_ok] Signed-off-by: Eric Anholt <eric@anholt.net>	2009-06-18 12:40:50 -07:00
Chris Wilson	8c4b8c3f34	drm/i915: Install fence register for tiled scanout on i915 With the work by Jesse Barnes to eliminate allocation of fences during execbuffer, it becomes possible to write to the scan-out buffer with it never acquiring a fence (simply by only ever writing to the object using tiled GPU commands and never writing to it via the GTT). So for pre-i965 chipsets which require fenced access for tiled scan-out buffers, we need to obtain a fence register. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-06-18 11:47:04 -07:00
Chris Wilson	d78b47b9a5	drm/i915: detach/attach get/put pages symmetry After performing an operation over the page list for a buffer retrieved by i915_gem_object_get_pages() the pages need to be returned with i915_gem_object_put_pages(). This was not being observed for the phys objects which were thus leaking references to their backing pages. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> CC: Dave Airlie <airlied@gmail.com> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-06-17 15:26:24 -07:00
Chris Wilson	2939e1f533	drm/i915: NOMEM->NOSPC To differentiate between encountering an out-of-memory error and running out of space in the aperture, use ENOSPC for the latter. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-06-09 14:51:39 -07:00
Chris Wilson	21d509e339	drm/i915: use I915_GEM_GPU_DOMAINS Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-06-09 14:51:18 -07:00
Chris Wilson	b1ce786cb8	drm/i915: no need to hold mutex for object lookup Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-06-09 14:41:37 -07:00
Chris Wilson	5f26a2c7ad	drm/i915: OR in the COMMAND read domain for the batch buffer. The batch buffer may be shared with another read buffer, so we should not ignore any previously set domains, but just or in the command domain (and check that the buffer is not writable). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-06-09 13:54:05 -07:00
Chris Wilson	83d6079515	drm/i915: Sanity check execbuffer arguments before touching state. By sending a broken execbuffer (its length was not suitably aligned) I triggered an operation upon a freed object. The invalid alignment was discovered after updating the write_domain on the object but before the object was placed on the active queue. So during the unwind process following the error, the now freed object attempts to flush its non-existent, but outstanding, GPU writes causing this use-after-free. [drm:i915_dispatch_gem_execbuffer] ERROR alignment [drm:i915_gem_execbuffer] ERROR dispatch failed -22 WARNING: at lib/kref.c:43 warn_slowpath_null+0x10/0x15() Modules linked in: Pid: 4552, comm: lt-csi-drm Not tainted 2.6.30-rc6 #423 Call Trace: [<c0119ef3>] warn_slowpath_fmt+0x57/0x6d [<c014de24>] ? get_pageblock_migratetype+0x18/0x1e [<c014e8fd>] ? free_hot_page+0xa/0xc [<c014e915>] ? __free_pages+0x16/0x1f [<c0153ebf>] ? shmem_truncate_range+0x63e/0x656 [<c015fb2f>] ? slob_page_alloc+0x146/0x1c8 [<c0119f19>] warn_slowpath_null+0x10/0x15 [<c01f55f2>] kref_get+0x1b/0x21 [<c02605db>] i915_gem_object_move_to_active+0x1f/0x56 [<c0261302>] i915_add_request+0x156/0x19a [<c026136e>] i915_gem_object_flush_gpu_write_domain+0x28/0x3f [<c0261eca>] i915_gem_object_unbind+0x4a/0x124 [<c0261fd7>] i915_gem_free_object+0x33/0x9b [<c0250d6b>] drm_gem_object_free+0x28/0x4a [<c0250d43>] ? drm_gem_object_free+0x0/0x4a [<c01f55ce>] kref_put+0x38/0x41 [<c0250cbf>] drm_gem_object_unreference+0x11/0x13 [<c0250d06>] drm_gem_object_handle_unreference+0x1e/0x21 [<c0250d13>] drm_gem_object_release_handle+0xa/0xe [<c01f3e6b>] idr_for_each+0x5f/0x98 [<c0250d09>] ? drm_gem_object_release_handle+0x0/0xe [<c0250daf>] drm_gem_release+0x22/0x34 [<c025046f>] drm_release+0x1e8/0x3c4 [<c0162d25>] __fput+0xaf/0x146 [<c0162dce>] fput+0x12/0x14 [<c01605ef>] filp_close+0x48/0x52 [<c011b182>] put_files_struct+0x57/0x9b [<c011b1e4>] exit_files+0x1e/0x20 [<c011c6b6>] do_exit+0x16d/0x511 [<c03704ab>] ? __schedule+0x3d4/0x3e5 [<c0103f0d>] ? handle_irq+0xd/0x69 [<c011caa7>] do_group_exit+0x4d/0x73 [<c011cae0>] sys_exit_group+0x13/0x17 [<c010268c>] sysenter_do_call+0x12/0x2b Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-06-09 13:52:57 -07:00
Zhenyu Wang	036a4a7d92	drm/i915: handle interrupt on new chipset Update interrupt handling methods for IGDNG with new registers for display and graphics interrupt functions. As we won't use irq-based vblank sync in dri2, so display interrupt on new chip will be used for hotplug only in future. Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-06-09 11:15:27 -07:00
Eric Anholt	b962442e46	drm/i915: Change GEM throttling to be 20ms like the comment says. keithp didn't like the original 20ms plan because a cooperative client could be starved by an uncooperative client. There may even have been problems with cooperative clients versus cooperative clients. So keithp changed throttle to just wait for the second to last seqno emitted by that client. It worked well, until we started getting more round-trips to the server due to DRI2 -- the server throttles in BlockHandler, and so if you did more than one round trip after finishing your frame, you'd end up unintentionally syncing to the swap. Fix this by keeping track of the client's requests, so the client can wait when it has an outstanding request over 20ms old. This should have non-starving behavior, good behavior in the presence of restarts, and less waiting. Improves high-settings openarena performance on my GM45 by 50%. Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-06-04 11:44:22 +00:00
Eric Anholt	0e7ddf7eee	drm/i915: Remove a bad BUG_ON in the fence management code. This could be triggered by a gtt mapping fault on 965 that decides to remove the fence from another object that happens to be active currently. Since the other object doesn't rely on the fence reg for its execution, we don't wait for it to finish. We'll soon be not waiting on 915 most of the time as well, so just drop the BUG_ON. Signed-off-by: Eric Anholt <eric@anholt.net>	2009-06-04 11:43:09 +00:00
Kristian Høgsberg	07f4f3e8a2	i915: Set object to gtt domain when faulting it back in When a GEM object is evicted from the GTT we set it to the CPU domain, as it might get swapped in and out or ever mmapped regularly. If the object is mmapped through the GTT it can still get evicted in this way by other objects requiring GTT space. When the GTT mapping is touched again we fault it back into the GTT, but fail to set it back to the GTT domain. This means we fail to flush any cached CPU writes to the pages backing the object which will then happen "eventually", typically after we write to the page through the uncached GTT mapping. [anholt: Note that userland does do a set_domain(GTT, GTT) when starting to access the GTT mapping. That covers getting the existing mapping of the object synchronized if it's bound to the GTT. But set_domain(GTT, GTT) doesn't do anything if the object is currently unbound. This fix covers the transition to being bound for GTT mapping.] Fixes glyph and other pixmap corruption during swapping. fd.o bug #21790 Signed-off-by: Kristian Høgsberg <krh@redhat.com> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-05-27 13:06:47 -07:00
Eric Anholt	cfa16a0de5	drm/i915: Apply a big hammer to 865 GEM object CPU cache flushing. On the 865, but not the 855, the clflush we do appears to not actually make it out to the hardware all the time. An easy way to safely reproduce was X -retro, which would show that some of the blits involved in drawing the lovely root weave didn't make it out to the hardware. Those blits are 32 bytes each, and 1-2 would be missing at various points around the screen. Other experimentation (doing more clflush, doing more AGP chipset flush, poking at some more device registers to maybe trigger more flushing) didn't help. krh came up with the wbinvd as a way to successfully get all those blits to appear. Signed-off-by: Eric Anholt <eric@anholt.net>	2009-05-26 19:11:33 -07:00
Eric Anholt	e76a16deb8	drm/i915: Fix tiling pitch handling on 8xx. The pitch field is an exponent on pre-965, so we were rejecting buffers on 8xx that we shouldn't have. 915 got lucky in that the largest legal value happened to match (8KB / 512 = 0x10), but 8xx has a smaller tile width. Additionally, we programmed that bad value into the register on 8xx, so the only pitch that would work correctly was 4096 (512-1023 pixels), while others would probably give bad rendering or hangs. Signed-off-by: Eric Anholt <eric@anholt.net> fd.o bug #20473.	2009-05-26 19:11:31 -07:00
Jesse Barnes	14b6039158	i915: support 8xx desktop cursors For some reason we never added 8xx desktop cursor support to the kernel. This patch fixes that. [krh: Also set the size on pre-i915 hw.] Tested-by: Kristian Høgsberg <krh@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-05-22 12:31:14 -07:00
Jesse Barnes	8e7d2b2c6e	drm/i915: allocate large pointer arrays with vmalloc For a while now, many of the GEM code paths have allocated page or object arrays with the slab allocator. This is nice and fast, but won't work well if memory is fragmented, since the slab allocator works with physically contiguous memory (i.e. order > 2 allocations are likely to fail fairly early after booting and doing some work). This patch works around the issue by falling back to vmalloc for >PAGE_SIZE allocations. This is ugly, but much less work than chaining a bunch of pages together by hand (surprisingly there's not a bunch of generic kernel helpers for this yet afaik). vmalloc space is somewhat precious on 32 bit kernels, but our allocations shouldn't be big enough to cause problems, though they're routinely more than a page. Note that this patch doesn't address the unchecked alloc-based-on-ioctl-args in GEM; that needs to be fixed in a separate patch. Also, I've deliberately ignored the DRM's "area" junk. I don't think anyone actually uses it anymore and I'm hoping it gets ripped out soon. [Updated: removed size arg to new free function. We could unify the free functions as well once the DRM mem tracking is ripped out.] fd.o bug #20152 (part 1/3) Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-05-19 10:07:14 -07:00
Jesse Barnes	802c7eb646	drm/i915: sanity check IER at wait_request time We might sleep here anyway so I hope an extra uncached read is ok to add. In #20896 we found that vbetool clobbers the IER. In KMS mode this is particularly bad since we don't set the interrupt regs late (in EnterVT), so we'd fail to get any interrupts at all after X started (since some distros have scripts that call vbetool at X startup apparently). So this patch checks IER at wait_request time, and re-enables interrupts if it's been clobbered. In a proper config this check should never be triggered. This is really a distro issue, but having a sanity check is nice, as long as it doesn't have a real performance hit. Tested-by: Mateusz Kaduk <mateusz.kaduk@gmail.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> [anholt: Moved the check inside of the sleeping case to avoid perf cost] Signed-off-by: Eric Anholt <eric@anholt.net>	2009-05-14 16:00:27 -07:00
Wu Fengguang	d816f6ac4f	drm/i915: fix unpaired i915 device mutex on entervt failure. Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-04-21 17:53:38 -07:00
Shaohua Li	68c8434217	drm/i915: fix scheduling while holding the new active list spinlock regression caused by commit `5e118f4139`: i915_gem_object_move_to_inactive() should be called in task context, as it calls fput(); Signed-off-by: Shaohua Li<shaohua.li@intel.com> [anholt: Add more detail to the comment about the lock break that's added] Signed-off-by: Eric Anholt <eric@anholt.net>	2009-04-14 11:45:28 -07:00
Eric Anholt	280b713b5b	drm/i915: Allow tiling of objects with bit 17 swizzling by the CPU. Save the bit 17 state of the pages when freeing the page list, and reswizzle them if necessary when rebinding the pages (in case they were swapped out). Since we have userland with expectations that the swizzle enums let it pread and pwrite contents accurately, we can't expose a new swizzle enum for bit 17 (which it would have to GTT map to handle), so we handle it down in pread and pwrite by swizzling the copy when bit 17 of the page address is set. Signed-off-by: Eric Anholt <eric@anholt.net>	2009-04-08 10:50:57 -07:00
Eric Anholt	e5e9ecde63	drm/i915: Correctly set the write flag for get_user_pages in pread. Otherwise, the results of our read didn't show up when we were faulting in the page being read into (as happened with a testcase reading into a big stack area). Likely accounts for some conformance test failures. Signed-off-by: Eric Anholt <eric@anholt.net>	2009-04-08 10:50:56 -07:00
Florian Mickler	2bc43b5cf5	drm/i915: Fix use of uninitialized var in `40a5f0de` i915_gem_put_relocs_to_user returned an uninitialized value which got returned to userspace. This caused libdrm in my setup to never get out of a do{}while() loop retrying i915_gem_execbuffer. The result was a hanging X, overheating of the CPU and 2-3GB of logfile spam. This patch addresses the issue by 1. initializing vars in this file where necessary 2. correcting wrongly interpreted return values of copy_[from/to]_user Signed-off-by: Florian Mickler <florian@mickler.org> [anholt: cleanups of unnecessary changes, consistency in APIs] Signed-off-by: Eric Anholt <eric@anholt.net>	2009-04-08 10:18:19 -07:00
Ben Gamari	6911a9b8ae	drm/i915: Implement batch and ring buffer dumping We create a debugfs node (i915_ringbuffer_data) to expose a hex dump of the ring buffer itself. We also expose another debugfs node (i915_ringbuffer_info) with information on the state (i.e. head, tail addresses) of the ringbuffer. For batchbuffer dumping, we look at the device's active_list, dumping each object which has I915_GEM_DOMAIN_COMMAND in its read domains. This is all exposed through the dri/i915_batchbuffers debugfs file with a header for each object (giving the object's gtt_offset so that it can be matched against the offset given in the BATCH_BUFFER_START command). Signed-off-by: Ben Gamari <bgamari@gmail.com> Signed-off-by: Carl Worth <cworth@cworth.org> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-04-08 10:18:06 -07:00
Carl Worth	5e118f4139	drm/i915: Add a spinlock to protect the active_list This is a baby-step in the direction of having finer-grained locking than the struct_mutex. Specifically, this will enable new debugging code to read the active list for printing out GPU state when the GPU is wedged, (while the struct_mutex is held, of course). Signed-off-by: Carl Worth <cworth@cworth.org> [anholt: indentation fix] Signed-off-by: Eric Anholt <eric@anholt.net>	2009-04-01 15:22:07 -07:00
Jesse Barnes	959b887cf4	drm/i915: check for -EINVAL from vm_insert_pfn Indicates something is wrong with the mapping; and apparently triggers in current kernels. Signed-off-by: Jesse Barnes <jbarnes@virtuosugeek.org> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-04-01 11:07:49 -07:00
Daniel Vetter	8d7773a32d	drm/i915: fix up tiling/fence reg setup on i8xx class hw This fixes all the tiling problems with the 2d ddx. glxgears still doesn't work. Changes: - fix a copy&paste error in i8xx fence reg setup. It resulted in an offset of at most 512KB in the fence reg window, so was only visible sometimes. - add tests for stride and object size constraints (also for i915 and i965 class hw). Userspace seems to have an off-by-one bug there, which changes the fence size by at most 512KB due to an overflow. - because i8xx hw is quite old (and therefore not as well-tested) I left 2 debug WARN_ONs in the i8xx fence reg setup code to hopefully catch any further overflows in the bit-fields. Lastly there's one small change to make the alignment checks more consistent. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=20289 Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-04-01 11:06:47 -07:00
Dave Airlie	d008877550	drm/i915: check the return value from the copy from user This produced a warning on my build, not sure why super-warning-man didn't notice this one, its much worse than the %z one. Signed-off-by: Dave Airlie <airlied@redhat.com>	2009-03-28 20:29:48 -04:00
Dave Airlie	90f959bcb3	drm: merge Linux master into HEAD Conflicts: drivers/gpu/drm/drm_info.c drivers/gpu/drm/drm_proc.c drivers/gpu/drm/i915/i915_gem_debugfs.c	2009-03-28 20:22:18 -04:00
Owain G. Ainsworth	ad086c833d	i915/drm: Remove two redundant agp_chipset_flushes agp_chipset_flush() is for flushing the intel GMCH write cache via the IFP, these two uses are for when we're getting the object into the cpu READ domain, and thus should not be needed. This confused me when I was getting my head around the code. With thanks to airlied for helping me check my mental picture of how the flushes and clflushes are supposed to be used. Signed-off-by: Owain G. Ainsworth <oga@openbsd.org> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-03-27 15:12:07 -07:00
Eric Anholt	40a5f0decd	drm/i915: Fix lock order reversal in GEM relocation entry copying. Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Keith Packard <keithp@keithp.com>	2009-03-27 14:47:55 -07:00
Eric Anholt	201361a54e	drm/i915: Fix lock order reversal with cliprects and cmdbuf in non-DRI2 paths. This introduces allocation in the batch submission path that wasn't there previously, but these are compatibility paths so we care about simplicity more than performance. kernel.org bug #12419. Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Keith Packard <keithp@keithp.com> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-03-27 14:47:34 -07:00
Eric Anholt	eb01459fbb	drm/i915: Fix lock order reversal in shmem pread path. Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-03-27 14:47:21 -07:00
Eric Anholt	40123c1f8d	drm/i915: Fix lock order reversal in shmem pwrite path. Like the GTT pwrite path fix, this uses an optimistic path and a fallback to get_user_pages. Note that this means we have to stop using vfs_write and roll it ourselves. Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-03-27 14:47:13 -07:00
Eric Anholt	856fa1988e	drm/i915: Make GEM object's page lists refcounted instead of get/free. We've wanted this for a few consumers that touch the pages directly (such as the following commit), which have been doing the refcounting outside of get/put pages. Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-03-27 14:46:28 -07:00
Eric Anholt	3de09aa3b3	drm/i915: Fix lock order reversal in GTT pwrite path. Since the pagefault path determines that the lock order we use has to be mmap_sem -> struct_mutex, we can't allow page faults to occur while the struct_mutex is held. To fix this in pwrite, we first try optimistically to see if we can copy from user without faulting. If it fails, fall back to using get_user_pages to pin the user's memory, and map those pages atomically when copying it to the GPU. Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>	2009-03-27 14:45:52 -07:00
Owain G. Ainsworth	995e37cafb	i915/drm: Remove two redundant agp_chipset_flushes agp_chipset_flush() is for flushing the intel GMCH write cache via the IFP, these two uses are for when we're getting the object into the cpu READ domain, and thus should not be needed. This confused me when I was getting my head around the code. With thanks to airlied for helping me check my mental picture of how the flushes and clflushes are supposed to be used. Signed-off-by: Owain G. Ainsworth <oga@openbsd.org> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2009-03-13 14:24:09 +10:00
Benjamin Herrenschmidt	f77d390c97	drm: Split drm_map and drm_local_map Once upon a time, the DRM made the distinction between the drm_map data structure exchanged with user space and the drm_local_map used in the kernel. For some reasons, while the BSD port still has that "feature", the linux part abused drm_map for kernel internal usage as the local map only existed as a typedef of the struct drm_map. This patch fixes it by declaring struct drm_local_map separately (though its content is currently identical to the userspace variant), and changing the kernel code to only use that, except when it's a user<->kernel interface (ie. ioctl). This allows subsequent changes to the in-kernel format. I've also replaced the use of drm_local_map_t with struct drm_local_map in a couple of places. Mostly by accident but they are the same (the former is a typedef of the latter) and I have some remote plans and half finished patch to completely kill the drm_local_map_t typedef so I left those bits in. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@linux.ie>	2009-03-13 14:23:56 +10:00
Eric Anholt	dc529a4fe1	drm/i915: fix 945 fence register writes for fence 8 and above. The last 8 fence registers sit at a different offset, so when we went to set fence number 8 in the lower offset, we instead set PGETBL_CTL, and the GPU got all sorts of angry at us. fd.o bug #20567. Easily reproducible by running glxgears and killing it about 6 times. Signed-off-by: Eric Anholt <eric@anholt.net>	2009-03-11 11:02:06 -07:00
Chris Wilson	d7619c4b9c	drm/i915: Protect active fences on i915 The i915 also uses the fence registers for GPU access to tiled buffers so we cannot reallocate one whilst it is on the active list. By performing a LRU scan of the fenced buffers we also avoid the possibility of waiting on a pinned, or otherwise unusable, buffer. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-03-11 10:33:28 -07:00
Chris Wilson	fc7170ba28	drm/i915: Check to see if we've pinned all available fences We need to check and report if there are no available fences - or else we spin endlessly waiting for a buffer to magically unpin itself. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-03-10 20:32:13 -07:00
Chris Wilson	22c344e9a0	drm/i915: Check fence status on every pin. As we may steal the fence register of an unpinned buffer for another, every time we repin the buffer we need to recheck whether it needs to be allocated a fence. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-03-10 20:25:32 -07:00
Chris Wilson	9b2412f9ad	drm/i915: First recheck for an empty fence register. If we wait upon a request and successfully unbind a buffer occupying a fence register, then that slot will be freed and cause a NULL dereference upon rescanning. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-03-10 20:22:24 -07:00
Kyle McMartin	0fce81e3cc	i915: add newline to i915_gem_object_pin failure msg Prevents formatting nasty as below: [drm:i915_gem_object_pin] ERROR Failure to bind: -12<3>[drm:i915_gem_evict_something] ERROR inactive empty 1 request empty 1 flushing empty 1 Signed-off-by: Kyle McMartin <kyle@redhat.com> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-03-10 13:11:11 -07:00
Kristian Høgsberg	b70d11da61	drm: Return EINVAL on duplicate objects in execbuffer object list If userspace passes an object list with the same object appearing more than once, we end up hitting the BUG_ON() in i915_gem_object_set_to_gpu_domain() as it gets called a second time for the same object. Signed-off-by: Kristian Høgsberg <krh@redhat.com> Signed-off-by: Eric Anholt <eric@anholt.net>	2009-03-10 13:11:11 -07:00
Dave Airlie	e08fb4f6d1	drm/i915: convert DRM_ERROR to DRM_DEBUG in phys object pwrite path This snuck in when I wrote phys object support. Signed-off-by: Dave Airlie <airlied@redhat.com>	2009-02-25 14:52:30 +10:00
Karsten Wiese	6c0594a306	Fix an oops in i915_gem_retire_requests() dev_priv->hw_status_page can be NULL, if i915_gem_retire_requests() is called from i915_gem_busy_ioctl(). Signed-off-by: Karsten Wiese <fzu@wemgehoertderstaat.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-02-23 17:14:47 -08:00
Chris Wilson	bab2d1f653	drm/i915: Fix regression in 95ca9d The object is dereferenced before the NULL check. Oops. Fixes http://bugs.freedesktop.org/show_bug.cgi?id=20235 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2009-02-23 10:06:30 +10:00
Eric Anholt	f21289b355	drm/i915: Retire requests from i915_gem_busy_ioctl. This ensures that the user gets the latest information from the hardware on whether the buffer is busy, potentially reducing the working set of objects that the user chooses. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2009-02-23 10:06:26 +10:00
Jesse Barnes	5669fcacc5	drm/i915: suspend/resume GEM when KMS is active In the KMS case, we need to suspend/resume GEM as well. So on suspend, make sure we idle GEM and stop any new rendering from coming in, and on resume, re-init the framebuffer and clear the suspended flag. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2009-02-23 10:06:23 +10:00
Eric Anholt	efbeed96f7	drm/i915: Don't let a device flush to prepare buffers clear new write_domains. The problem was that object_set_to_gpu_domain would set the new write_domains that are getting set by this batchbuffer, then the accumulated flushes required for all the objects in preparation for this batchbuffer were posted, and the brand new write domain would get cleared by the flush being posted. Instead, hang on to the new (or old if we're not changing it) value and set it after the flush is queued. Results from this noticeably included conformance test failures from reads shortly after writes (where the new write domain had been lost and thus not flushed and waited on), but is a suspected cause of hangs in some apps when a write domain is lost on a buffer that gets reused for instruction or command state. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2009-02-23 10:06:19 +10:00
Eric Anholt	8b0e378a20	drm/i915: Cut two args to set_to_gpu_domain that confused this tricky path. While not strictly required, it helped while thinking about the following change. This change should be invariant. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2009-02-23 10:06:15 +10:00
Jesse Barnes	ab00b3e521	drm/i915: Keep refs on the object over the lifetime of vmas for GTT mmap. This fixes potential fault at fault time if the object was unreferenced while the mapping still existed. Now, while the mmap_offset only lives for the lifetime of the object, the object also stays alive while a vma exists that needs it. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2009-02-20 12:21:13 +10:00
Chris Wilson	85a7bb9858	drm/i915: Cleanup the hws on ringbuffer construction failure. If we fail to create the ringbuffer, then we need to cleanup the allocated hws. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@linux.ie>	2009-02-20 12:21:12 +10:00
Chris Wilson	3eb2ee77b0	drm/i915: Unpin the hws if we fail to kmap. A missing unpin on the error path. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@linux.ie>	2009-02-20 12:21:11 +10:00
Chris Wilson	47ed185a77	drm/i915: Unpin the ringbuffer if we fail to ioremap it. A missing unpin on the error path. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@linux.ie>	2009-02-20 12:21:11 +10:00
Chris Wilson	491152b877	drm/i915: unpin for an invalid memory domain. A missing unreference and unpin after rejecting the relocation for an invalid memory domain. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@linux.ie>	2009-02-20 12:21:11 +10:00
Chris Wilson	13af106276	drm/i915: Release and unlock on mmap_gtt error path. We failed to unlock the mutex after failing to create the mmap offset. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@linux.ie>	2009-02-20 12:21:11 +10:00
Roland Dreier	a35f2e2b83	drm/i915: Fix potential AB-BA deadlock in i915_gem_execbuffer() Lockdep warns that i915_gem_execbuffer() can trigger a page fault (which takes mmap_sem) while holding dev->struct_mutex, while drm_vm_open() (which is called with mmap_sem already held) takes dev->struct_mutex. So this is a potential AB-BA deadlock. The way that i915_gem_execbuffer() triggers a page fault is by doing copy_to_user() when returning new buffer offsets back to userspace; however there is no reason to hold the struct_mutex when doing this copy, since what is being copied is the contents of an array private to i915_gem_execbuffer() anyway. So we can fix the potential deadlock (and get rid of the lockdep warning) by simply moving the copy_to_user() outside of where struct_mutex is held. This fixes <http://bugzilla.kernel.org/show_bug.cgi?id=12491>. Reported-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@linux.ie>	2009-02-20 12:21:10 +10:00
Chris Wilson	96dec61d56	drm/i915: refleak along pin() error path. A missing unreference if the user calls pin() a second time on a pinned buffer. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@linux.ie>	2009-02-20 12:21:09 +10:00
Chris Wilson	a198bc80ae	drm/i915: Cleanup trivial leak on execbuffer error path. Also spotted by Owain Ainsworth. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@linux.ie>	2009-02-20 12:21:09 +10:00
Linus Torvalds	f06da264cf	i915: Fix more size_t format string warnings The DRI people seem to have a hard time getting these right (see also commit `aeb565dfc3`). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-02-09 08:57:29 -08:00
Chris Wilson	7d8d58b23f	drm/i915: Unlock mutex on i915_gem_fault() error path If we failed to allocate a new fence register we would return VM_FAULT_SIGBUS without relinquishing the lock. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@linux.ie>	2009-02-08 21:38:27 +10:00
Jesse Barnes	0f973f2788	drm/i915: add fence register management to execbuf Adds code to set up fence registers at execbuf time on pre-965 chips as necessary. Also fixes up a few bugs in the pre-965 tile register support (get_order != ffs). The number of fences available to the kernel defaults to the hw limit minus 3 (for legacy X front/back/depth), but a new parameter allows userspace to override that as needed. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@linux.ie>	2009-02-08 21:38:02 +10:00
Eric Anholt	d9ddcb96e0	drm/i915: Return error from i915_gem_object_get_fence_reg() when failing. Previously, the caller would continue along without knowing that the function failed, resulting in potential mis-rendering. Right now vm_fault just returns SIGBUS in that case, and we may need to disable signal handling to avoid that happening. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@linux.ie>	2009-02-08 21:37:56 +10:00
Eric Anholt	ab657db12d	drm/i915: Set up an MTRR covering the GTT at driver load. We'd love to just be using PAT, but even on chips with PAT it gets disabled sometimes due to an errata. It would probably be better to have pat_enabled exported and only bother with this when !pat_enabled. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@linux.ie>	2009-02-08 21:37:50 +10:00
Eric Anholt	e806b49574	drm/i915: Suppress GEM teardown on X Server exit in KMS mode. Fixes hangs when starting X for the second time. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@linux.ie>	2009-02-08 21:37:41 +10:00
Linus Torvalds	832fb4a01c	Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: drm/i915: Fix cursor physical address choice to match the 2D driver. drm: stash AGP include under the do-we-have-AGP ifdef drm: don't whine about not reading EDID data drm/i915: hook up LVDS DPMS property drm/i915: remove unnecessary debug output in KMS init i915: fix freeing path for gem phys objects. drm: create mode_config idr lock drm: fix leak of device mappings since multi-master changes.	2009-01-26 10:16:11 -08:00
Linus Torvalds	aeb565dfc3	Fix annoying DRM_ERROR() string warning Use '%zu' to print out a size_t variable, not '%d'. Another case of the "let's keep at least Linus' defconfig compile warningless" rule. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-26 10:01:53 -08:00
Dave Airlie	260883c856	i915: fix freeing path for gem phys objects. This off-by-one was pointed out by Jesse Barnes. Signed-off-by: Dave Airlie <airlied@redhat.com>	2009-01-22 17:58:49 +10:00
Dave Airlie	71acb5eb8d	drm/i915: add support for physical memory objects This is an initial patch to do support for objects which needs physical contiguous main ram, cursors and overlay registers on older chipsets. These objects are bound on cursor bin, like pinning, and we copy the data to/from the backing store object into the real one on attach/detach. notes: possible over the top in attach/detach operations. no overlay support yet. Signed-off-by: Dave Airlie <airlied@redhat.com>	2009-01-16 18:45:06 +10:00
Harvey Harrison	9b4778f680	trivial: replace last usages of __FUNCTION__ in kernel __FUNCTION__ is gcc-specific, use __func__ Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-07 15:48:54 -08:00
Eric Anholt	9bb2d6f94a	drm/i915: Don't allow objects to get bound while VT switched. This avoids a BUG_ON in the enter_vt path due to objects being in the GTT when we shouldn't have ever let them be (as we're not supposed to touch the device during that time). This was triggered by a change in the 2D driver to use the GTT mapping of objects after pinning them to improve software fallback performance. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@linux.ie>	2009-01-07 11:49:39 +10:00
Julia Lawall	aad87dff5a	drm/i915: Remove redundant test in error path. The error path for object list being null is in the second goto target. Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@linux.ie>	2008-12-29 17:47:27 +10:00
Eric Anholt	f1acec9338	drm/i915: Don't print to dmesg when taking signal during object_pin. This showed up in logs where people had a hung chip, so pinning was blocked on the chip unpinning other buffers, and the X Server took its scheduler signal during that time. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@linux.ie>	2008-12-29 17:47:27 +10:00
Eric Anholt	b117763627	drm/i915: Don't double-unpin buffers if we take a signal in evict_everything(). We haven't seen this in practice, but it was visible when looking at a bug report from when i915_gem_evict_everything() was broken and would always return error. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@linux.ie>	2008-12-29 17:47:25 +10:00
Jesse Barnes	79e539453b	DRM: i915: add mode setting support This commit adds i915 driver support for the DRM mode setting APIs. Currently, VGA, LVDS, SDVO DVI & VGA, TV and DVO LVDS outputs are supported. HDMI, DisplayPort and additional SDVO output support will follow. Support for the mode setting code is controlled by the new 'modeset' module option. A new config option, CONFIG_DRM_I915_KMS controls the default behavior, and whether a PCI ID list is built into the module for use by user level module utilities. Note that if mode setting is enabled, user level drivers that access display registers directly or that don't use the kernel graphics memory manager will likely corrupt kernel graphics memory, disrupt output configuration (possibly leading to hangs and/or blank displays), and prevent panic/oops messages from appearing. So use caution when enabling this code; be sure your user level code supports the new interfaces. A new SysRq key, 'g', provides emergency support for switching back to the kernel's framebuffer console; which is useful for testing. Co-authors: Dave Airlie <airlied@linux.ie>, Hong Liu <hong.liu@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2008-12-29 17:47:23 +10:00
Jesse Barnes	de151cf67c	drm/i915: add GEM GTT mapping support Use the new core GEM object mapping code to allow GTT mapping of GEM objects on i915. The fault handler will make sure a fence register is allocated too, if the object in question is tiled. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2008-12-29 17:47:23 +10:00
Eric Anholt	c4de0a5d67	drm/i915: Don't return busy for buffers left on the flushing list. These buffers don't have active rendering still occurring to them, they just need either a flush to be emitted or a retire_requests to occur so that we notice they're done. Return unbusy so that one of the two occurs. The two expected consumers of this interface (OpenGL and libdrm_intel BO cache) both want this behavior. Signed-off-by: Eric Anholt <eric@anholt.net> Acked-by: Keith Packard <keithp@keithp.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2008-12-19 15:34:32 +10:00
Owain Ainsworth	15c35334c9	drm/i915: Don't return error in evict_everything when we get to the end. Returning -ENOMEM errored all the way out of execbuf, so the rendering never occurred. Signed-off-by: Dave Airlie <airlied@redhat.com>	2008-12-09 15:37:17 +10:00
Eric Anholt	0235439232	drm/i915: Return error in i915_gem_set_to_gtt_domain if we're not in the GTT. It's only for flushing caches appropriately for GTT access, not for actually getting it there. Prevents potential smashing of cpu read/write domains on unbound objects. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2008-12-04 11:24:47 +10:00
Keith Packard	ac94a962b2	drm/i915: Retry execbuffer pinning after clearing the GTT If we fail to pin all of the buffers in an execbuffer request, go through and clear the GTT and try again to see if its just a matter of fragmentation Signed-off-by: Keith Packard <keithp@keithp.com> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2008-12-04 11:22:06 +10:00
Keith Packard	646f0f6e43	drm/i915: Move the execbuffer domain computations together This eliminates the dev_set_domain function and just in-lines it where its used, with the goal of moving the manipulation and use of invalidate_domains and flush_domains closer together. This also avoids calling add_request unless some domain has been flushed. Signed-off-by: Keith Packard <keithp@keithp.com> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2008-12-04 11:22:02 +10:00
Keith Packard	c0d9082928	drm/i915: Rename object_set_domain to object_set_to_gpu_domain Now that the CPU and GTT domain operations are isolated to their own functions, the previously general-purpose set_domain function is now used only to set GPU domains. It also has no failure cases, which is important as this eliminates any possible interruption of the computation of new object domains and subsequent emission of the flushing instructions into the ring. Signed-off-by: Keith Packard <keithp@keithp.com> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2008-12-04 11:21:58 +10:00
Eric Anholt	e47c68e9c5	drm/i915: Make a single set-to-cpu-domain path and use it wherever needed. This fixes several domain management bugs, including potential lack of cache invalidation for pread, potential failure to wait for set_domain(CPU, 0), and more, along with producing more intelligible code. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2008-12-04 11:21:55 +10:00
Eric Anholt	2ef7eeaa55	drm/i915: Make a single set-to-gtt-domain path. This fixes failure to flush caches in the relocation update path, and failure to wait in the set_domain ioctl, each of which could lead to incorrect rendering. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2008-12-04 11:21:52 +10:00
Eric Anholt	b670d81582	drm/i915: If interrupted while setting object domains, still emit the flush. Otherwise, we would leave the objects in an inconsistent state, such as write_domain == 0 but on the flushing list. Signed-off-by: Dave Airlie <airlied@redhat.com>	2008-12-04 11:21:48 +10:00
Eric Anholt	ce44b0ea3d	drm/i915: Move flushing list cleanup from flush request retire to request emit. obj_priv->write_domain is "write domain if the GPU went idle now", not "write domain at this moment." By postponing the clear, we confused the concept, required more storage, and potentially emitted more flushes than are required. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2008-12-04 11:21:45 +10:00
Eric Anholt	151903d546	drm/i915: Fix copy'n'pasteo that broke VT switch if flushing was non-empty. Introduced in the "Avoid BUG_ONs on VT switch" commit. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2008-12-01 10:23:21 +10:00
Keith Packard	6133047aa6	drm/i915: execbuffer pins objects, no need to ensure they're still in the GTT Before we had the notion of pinning objects, we had a kludge around to make sure all of the objects were still resident in the GTT before we committed to executing a batch buffer. We don't need this any longer, and it sticks an error return in the middle of object domain computations that must be associated with a subsequent flush/invalidate emission into the ring. Signed-off-by: Keith Packard <keithp@keithp.com> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2008-11-25 09:28:43 +10:00
Keith Packard	2678d9d696	drm/i915: Subtract total pinned bytes from available aperture size The old code was wandering through the active list looking for pinned buffers; there may be other pinned buffers around. Fortunately, we keep a count of the total amount of pinned memory and can use that instead. Signed-off-by: Keith Packard <keithp@keithp.com> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2008-11-25 09:28:06 +10:00
Eric Anholt	28dfe52a6e	drm/i915: Avoid BUG_ONs on VT switch with a wedged chipset. Instead, just warn that bad things are happening and do our best to clean up the mess without the GPU's help. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2008-11-25 09:27:52 +10:00
Owen Taylor	6a47baa6ce	i915: Don't attempt to short-circuit object_wait_rendering by checking domains. This could return early when reading after writing a buffer, if somebody had already put it on the flushing list (write domains are 0, but still active), leading to glReadPixels failure. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@linux.ie>	2008-11-11 17:43:26 +10:00
Linus Torvalds	da4a22cba7	Merge branch 'io-mappings-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'io-mappings-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: io mapping: clean up #ifdefs io mapping: improve documentation i915: use io-mapping interfaces instead of a variety of mapping kludges resources: add io-mapping functions to dynamically map large device apertures x86: add iomap_atomic*()/iounmap_atomic() on 32-bit using fixmaps	2008-11-03 10:15:40 -08:00
Eric Anholt	5a125c3c79	i915: Add GEM ioctl to get available aperture size. This will let userland know when to submit its batchbuffers, before they get too big to fit in the aperture. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2008-11-03 10:56:49 +10:00
Keith Packard	0839ccb8ac	i915: use io-mapping interfaces instead of a variety of mapping kludges Impact: optimize/clean-up the IO mapping implementation of the i915 DRM driver Switch the i915 device aperture mapping to the io-mapping interface, taking advantage of the cleaner API to extend it across all of the mapping uses, including both pwrite and relocation updates. This dramatically improves performance on 64-bit kernels which were using the same slow path as 32-bit non-HIGHMEM kernels prior to this patch. Signed-off-by: Keith Packard <keithp@keithp.com> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2008-10-31 10:12:40 +01:00
Linus Torvalds	70740d6c93	Merge branch 'drm-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: drm: Avoid oops in DRM_IOCTL_RM_DRAW if a bad handle is supplied. drm: Add 32-bit compatibility for DRM_IOCTL_UPDATE_DRAW. drm/i915: use pipes, not planes to label vblank data drm/i915: hold dev->struct_mutex and DRM lock during vblank ring operations i915: Fix format string warnings on x86-64. i915: Don't dereference HWS in /proc debug files when it isn't initialized. i915: Enable IMR passthrough of vblank events before enabling it in pipestat. drm: Remove two leaks of vblank reference count in error paths. drm: fix leak of cliprects in drm_rmdraw() i915: Disable MSI on GM965 (errata says it doesn't work) drm: Set cliprects to NULL when changing drawable to having 0 cliprects. i915: Protect vblank IRQ reg access with spinlock	2008-10-23 10:18:40 -07:00
Keith Packard	9e44af790f	drm/i915: hold dev->struct_mutex and DRM lock during vblank ring operations To synchronize clip lists with the X server, the DRM lock must be held while looking at drawable clip lists. To synchronize with other ring access, the ring mutex must be held while inserting commands into the ring. Failure to do the first resulted in easy visual corruption when moving windows, and the second could have corrupted the ring with DRI2. Grabbing the DRM lock involves using the DRM tasklet mechanism, grabbing the ring mutex means potentially sleeping. Deal with both of these by always running the tasklet from a work handler. Also, protect from clip list changes since the vblank request was queued by making sure the window has at least one rectangle while looking inside, preventing oopses . Signed-off-by: Keith Packard <keithp@keithp.com> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2008-10-23 13:45:56 +10:00
Thomas Gleixner	e8848a170f	fix CONFIG_HIGHMEM compile error in drivers/gpu/drm/i915/i915_gem.c commit `9b7530cc32` ("i915: cleanup coding horrors in i915_gem_gtt_pwrite()") broke the i386 build for CONFIG_HIGHMEM=y. Caught by automatic testing http://www.tglx.de/autoqa-logs/000137-0006-0001.log Signed-off-by: Thomas Gleixner <tglx@linutronix.de> [ My bad. It's the same patch I sent out earlier, nobody noticed then either.. ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 16:15:14 -07:00
Linus Torvalds	9b7530cc32	i915: cleanup coding horrors in i915_gem_gtt_pwrite() Yes, this will probably be switched over to a cleaner model anyway, but in the meantime I don't want to see the 'unused variable' warnings that come from the disgusting #ifdef code. Make the special case be a nice inline function of its own, clean up the code, and make the warning go away. I wish people didn't write code that gets (valid) warnings from the compiler, but I'll limit my fixes to code that I actually care about (in this case just because I see the warning and it annoys me). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-20 14:16:43 -07:00
Keith Packard	6dbe2772d6	i915: Don't run retire work handler while suspended At leavevt and lastclose time, cancel any pending retire work handler invocation, and keep the retire work handler from requeuing itself if it is currently running. This patch restructures i915_gem_idle to perform all of these tasks instead of having both leavevt and lastclose call a sequence of functions. Signed-off-by: Keith Packard <keithp@keithp.com> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2008-10-18 07:10:53 +10:00
Keith Packard	ba1eb1d825	i915: Map status page cached for chips with GTT-based HWS location. This should improve performance by avoiding uncached reads by the CPU (the point of having a status page), and may improve stability. This patch only affects G33, GM45 and G45 chips as those are the only ones using GTT-based HWS mappings. Signed-off-by: Keith Packard <keithp@keithp.com> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2008-10-18 07:10:53 +10:00
Keith Packard	50aa253d82	i915: Fix up ring initialization to cover G45 oddities G45 appears quite sensitive to ring initialization register writes, sometimes leaving the HEAD register with the START register contents. Check to make sure HEAD is reset correctly when START is written, and fix it up, screaming loudly. Signed-off-by: Keith Packard <keithp@keithp.com> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2008-10-18 07:10:53 +10:00
Dave Airlie	e7d22bc3cb	i915: add missing return in error path. Signed-off-by: Dave Airlie <airlied@redhat.com>	2008-10-18 07:10:52 +10:00
Eric Anholt	3043c60c48	drm: Clean up many sparse warnings in i915. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2008-10-18 07:10:52 +10:00
Eric Anholt	bd88ee4c1b	drm: Use ioremap_wc in i915_driver instead of ioremap, since we always want WC. Fixes failure to map the ringbuffer when PAT tells us we don't get to do uncached on something that's already mapped WC, or something along those lines. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2008-10-18 07:10:52 +10:00
Eric Anholt	4f481ed22e	drm: Avoid oops in GEM execbuffers with bad arguments. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2008-10-18 07:10:52 +10:00
Kristian Høgsberg	dbb19d302b	i915 gem: install and uninstall irq handler in entervt and leavevt ioctls. Signed-off-by: Kristian Høgsberg <krh@redhat.com> Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2008-10-18 07:10:52 +10:00
Eric Anholt	546b0974c3	i915: Use struct_mutex to protect ring in GEM mode. In the conversion for GEM, we had stopped using the hardware lock to protect ring usage, since it was all internal to the DRM now. However, some paths weren't converted to using struct_mutex to prevent multiple threads from concurrently working on the ring, in particular between the vblank swap handler and ioctls. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2008-10-18 07:10:51 +10:00
Eric Anholt	673a394b1e	drm: Add GEM ("graphics execution manager") to i915 driver. GEM allows the creation of persistent buffer objects accessible by the graphics device through new ioctls for managing execution of commands on the device. The userland API is almost entirely driver-specific to ensure that any driver building on this model can easily map the interface to individual driver requirements. GEM is used by the 2d driver for managing its internal state allocations and will be used for pixmap storage to reduce memory consumption and enable zero-copy GLX_EXT_texture_from_pixmap, and in the 3d driver is used to enable GL_EXT_framebuffer_object and GL_ARB_pixel_buffer_object. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Dave Airlie <airlied@redhat.com>	2008-10-18 07:10:12 +10:00

2029 Commits