The bulk of the change is to convert the growing list of rings into an
array so that the relationship between the rings and the semaphore sync
registers can be easily computed.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Having seen the effects of erroneous fencing on the batchbuffer, a
useful sanity check is to record the fence registers at the time of an
error.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
When trying to diagnose mysterious errors on resume, capture the
display register contents as well.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The pinned buffers are useful for diagnosing errors in setting up state
for the chipset, which may not necessarily be 'active' at the time of
the error, e.g. the cursor buffer object.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
We only ever used the PRB0, neglecting the secondary ring buffers, and
now with the advent of multiple engines with separate ring buffers we
need to excise the anachronisms from our code (and be explicit about
which ring we mean where). This is doubly important in light of the
FORCEWAKE required to read ring buffer registers on SandyBridge.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This holds error state from the main graphics arbiter mainly involving
the DMA engine and address translation.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Currently, we believe the GPU is idle if just the RENDER ring is idle.
This is obviously wrong if we only using either the BLT or the BSD
rings and so masking genuine hangs.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Preparing the ringbuffer for adding new commands can fail (a timeout
whilst waiting for the GPU to catch up and free some space). So check
for any potential error before overwriting HEAD with new commands, and
propagate that error back to the user where possible.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The ringbuffer keeps a pointer to the parent device, so we can use that
instead of passing around the pointer on the stack.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
* 'drm-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (476 commits)
vmwgfx: Implement a proper GMR eviction mechanism
drm/radeon/kms: fix r6xx/7xx 1D tiling CS checker v2
drm/radeon/kms: properly compute group_size on 6xx/7xx
drm/radeon/kms: fix 2D tile height alignment in the r600 CS checker
drm/radeon/kms/evergreen: set the clear state to the blit state
drm/radeon/kms: don't poll dac load detect.
gpu: Add Intel GMA500(Poulsbo) Stub Driver
drm/radeon/kms: MC vram map needs to be >= pci aperture size
drm/radeon/kms: implement display watermark support for evergreen
drm/radeon/kms/evergreen: add some additional safe regs v2
drm/radeon/r600: fix tiling issues in CS checker.
drm/i915: Move gpu_write_list to per-ring
drm/i915: Invalidate the to-ring, flush the old-ring when updating domains
drm/i915/ringbuffer: Write the value passed in to the tail register
agp/intel: Restore valid PTE bit for Sandybridge after bdd3072
drm/i915: Fix flushing regression from 9af90d19f
drm/i915/sdvo: Remove unused encoding member
i915: enable AVI infoframe for intel_hdmi.c [v4]
drm/i915: Fix current fb blocking for page flip
drm/i915: IS_IRONLAKE is synonymous with gen == 5
...
Fix up conflicts in
- drivers/gpu/drm/i915/{i915_gem.c, i915/intel_overlay.c}: due to the
new simplified stack-based kmap_atomic() interface
- drivers/gpu/drm/vmwgfx/vmwgfx_drv.c: added .llseek entry due to BKL
removal cleanups.
Keep the current interface but ignore the KM_type and use a stack based
approach.
The advantage is that we get rid of crappy code like:
#define __KM_PTE \
(in_nmi() ? KM_NMI_PTE : \
in_irq() ? KM_IRQ_PTE : \
KM_PTE0)
and in general can stop worrying about what context we're in and what kmap
slots might be appropriate for that.
The downside is that FRV kmap_atomic() gets more expensive.
For now we use a CPP trick suggested by Andrew:
#define kmap_atomic(page, args...) __kmap_atomic(page)
to avoid having to touch all kmap_atomic() users in a single patch.
[ not compiled on:
- mn10300: the arch doesn't actually build with highmem to begin with ]
[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix up drivers/gpu/drm/i915/intel_overlay.c]
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Miller <davem@davemloft.net>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Dave Airlie <airlied@linux.ie>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Based on an original patch by Zhenyu Wang, this initializes the BLT ring for
SandyBridge and enables support for user execbuffers.
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
To handle retirements, we need per-ring tracking of active objects.
To handle evictions, we need global tracking of active objects.
As we enable more rings, rebuilding the global list from the individual
per-ring lists quickly grows tiresome and overly complicated. Tracking the
active objects in two lists is the lesser of two evils.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
On Sandybridge, the bit definition for hotplug on SDE has changed, so
update the code to new definition.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=30378
Cc: stable@kernel.org
Signed-off-by: Yuanhan Liu <yuanhan.liu@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Owain Ainsworth reported an issue between the interaction of the
hangcheck and userspace immediately (and permanently) falling back to
s/w rasterisation. In order to break the mutex and begin resetting the
GPU, we must abort the current operation (usually within the wait) and
climb sufficiently far back up the call chain to drop the mutex. In his
implementation, Owain has a loop within the ioctl handler to detect the
hang and then sleep until the error handler has run. I've chosen to
return to userspace and report an EAGAIN which should trigger the
userspace ioctl handler to repeat the call (simply because it felt less
invasive...). Before hitting a wedged GPU, we then wait upon completion
of the error handler.
Reported-by: Owain G. Ainsworth <zerooa@googlemail.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Avoid cause latencies in other clients by not taking the global struct
mutex and moving the per-client request manipulation a local per-client
mutex. For example, this allows a compositor to schedule a page-flip
(through X) whilst an OpenGL application is monopolising the GPU.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This ring buffer is used for video decoding/encoding on Sandybridge.
Signed-off-by: Xiang, Haihao <haihao.xiang@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Previously we only tidied up the active bo lists for chipsets were we
would attempt to reset the GPU. However, this action is necessary for
the system to continue and reclaim the dead bo for all chipsets.
Pointed out, in passing, by Owain Ainsworth.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Ironlake's graphics reset register has to be accessed via the MCHBAR,
rather than via PCI config space, which requires some refactoring.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The graphics domains are listed as GRDOM in the documentation, and the
GDRST PCI config register (0xc0) is only valid on I965 and GM45. Newer
chips (like Sandy Bridge) have a different GDRST.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
By reducing the hangcheck frequency we check less often, conserving
resources, and still detect a lock up quickly. On a fast machine with a
slow GPU (like a Core2 paired with a 945G) it is easy for the hangcheck to
misfire as we check too fast.
Also once hung and if we fail to completely reset the chip, we have a
nasty habit of proclaming a hang many times a second and generating a
strobe-like display.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The purpose is to make the code much easier to read and therefore reduce
the possibility for bugs.
A side effect is that it also makes it much easier for the compiler,
reducing the object size by 4k -- from just a few functions!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[Patch is slightly larger than is strictly necessary to fixup
surrounding checkpatch.pl errors.]
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
If we are busy, then we may have woken up the wait_request handler but
not yet serviced it before the hang check fires. So in hang check,
double check that the i915_gem_do_wait_request() is still pending the
wake-up before declaring all hope lost.
Fixes regression with e78d73b16b.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=30073
Reported-and-tested-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The GPU records whether it is currently waiting for a completion of a
WAIT_FOR_EVENT in the RB_WAIT bit in the ringbuffer control registers.
On third generation chipsets and later, a write of 1 to this bit breaks
the hang and returns the GPU to arbitration, i.e. the GPU should
continue executing the reminder of the batchbuffer and return to normal
operations.
By adding this to hangcheck we can avoid a full GPU reset under these
conditions.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Alexander reported that the compilation of intel_overlay.c was failing
due to an inclusion that was only valid with CONFIG_DEBUG_FS. As the
whole error reporting is only useful with debugfs enabled, remove all
the redundant error state collection code when compiling without
CONFIG_DEBUG_FS.
Reported-by: Alexander Lam <lambchop468@gmail.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Fix a minor confusion between intel_page_flip_finish(pipe) and
intel_page_flip_finish_plane(plane) -- should have no effect as
currently we map pipe 0 to plane 0 (and pipe 1 to plane 1).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
It's part of the generic Intel driver infrastructure so rename it in
prepreparation for using it for VBT.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
When we miss the flip prepare interrupt, we never get into the
software state needed to restart userspace, resulting in a freeze of a
full-screen OpenGL application (such as a compositor).
Work around this by checking DSPxSURF/DSPxBASE to see if the page flip
has actually happened. If it has, do the work we would have done when
the flip prepare interrupt comes in.
Also, add debugfs information to tell us what's going on (based on the
patch from Chris Wilson attached to bugs.fdo bug #29798).
Signed-off-by: Simon Farnsworth <simon.farnsworth@onelan.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
This one is missed in last pipe control fix for sandybridge,
that really unmask interrupt bit for notify in render engine IMR.
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
If our watchdog fires and we see that the GPU is idle, but that we
are still waiting on an interrupt, forcibly wake-up the waiter.
i915_do_wait_request() should not be racy, yet there are persistent
reports that 945GM hangs whilst the GPU is idle. This implies that the
hardware is not quite as coherent as the documentation claims - a write
followed by a flush is supposed to be coherent in main memory before the
flush is retired and the irq is emitted. This seems to be a sensible and
elegant guard to force the wait to timeout.
v2: Daniel Vetter pointed out that a warning would be useful to explain
why the machine appeared to stall.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
Directly read the GTT mapping for the contents of the batch buffers
rather than relying on possibly stale CPU caches. Also for completeness
scan the flushing/inactive lists for the current buffers - we are
collecting error state after all.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
v2: Add the interrupt status and address.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Eric Anholt <eric@anholt.net>
References:
Bug 26691 - Spurious hangcheck whilst executing a long shader over a
large vertex buffer
https://bugs.freedesktop.org/show_bug.cgi?id=26691
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Eric Anholt <eric@anholt.net>
Having two sets has made me think I caught a bug more than once now.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Eric Anholt <eric@anholt.net>